1 Introduction

The analysis of large Markov chains is a recurring challenge
in many important areas ranging from quantitative security [1]
to computer network dependability and performance [2], [3].
To evaluate properties of such systems, a standard approach
is to perform numerical analysis, nowadays often embedded
in a stochastic model checker [4], [5], [6], [7]. At its core, the
model checker has to operate with a very large matrix induced
by the Markov chain. In this context, the use of symbolic
representations, in particular variations of decision diagrams
such as MTBDDs (multi-terminal binary decision diagrams) [8], [9],
MDDs [10], or ZDDs [11], has made it possible to store and
manipulate very large matrices in a symbolic manner. Many
of the applications occurring in practice lead to very large 
continuous-time Markov chains (CTMCs) that nevertheless 
contain only a very small number of different transition rates. 
This is a primary reason why decision diagrams — where distinct 
rates are stored as distinct values in the structure — are effective. 
Whenever there are many pairwise different rates occurring, the 
decision diagram degenerates to a decision tree, and thus its size 
explodes. Therefore, models with a high number of different 
rates are a notorious problem for symbolic representations, and 
hence for the stochastic model checkers available to date. 
However, there is a growing spectrum of important applications
that give rise to excessive numbers of distinct rates.
Power grid stability [12], [13], crowd dynamics [14], [15] as
well as (computer) virus epidemiology [16], [17], [18] are
important examples where Markov models are huge and rates
change from state to state. The study of these phenomena
is of growing importance for the assurance of security and
dependability. Several of these examples can in some way
be regarded as population models [19], [20], where the rates
change with population counts, similar to models appearing
in systems biology [21] and also in classical performance and
dependability engineering [22], [23].

This paper targets the analysis of transient properties of
CTMCs with both a large number of states and a large
number of distinct transition rates. It presents a combination
of abstraction techniques, an explicit representation of a small
abstract model, and symbolic techniques (the latter using
ordered binary DDs (OBDDs), not MTBDDs or MDDs),
which we thus call symblicit. The abstraction method relies
on visiting all concrete states of the abstract model to obtain
bounds on the transition matrix, but without having to store the
state space explicitly. We also present ideas on how to speed up
this admittedly time-consuming process. On the one hand,
the approach can be seen as a continuation of our previous
work on symblicit algorithms [24], [25]. On the other hand,
we harvest work done on the abstraction of Markov chains
to abstract Markov chains or Markov decision processes [26],
[27], [28], [29], [30], [31], [32].

A number of related methods exist: We build on many ideas
for analyses using abstract Markov chains by Klink et al. [26],
[28]. This paper extends their work by describing a widely
applicable abstraction method, and also by handling more
general transient properties.

MTBDD-based methods [8], [9] work well for some models,
but have the disadvantages described above. The method of Wan
et al. [10] uses a slightly different data structure to represent
concrete models, and focuses on steady-state properties rather
than transient ones. It does not rely on Markov decision models,
and, though it works well in practice for certain model classes,
cannot guarantee safe bounds for properties of the concrete
model.

Techniques in which a symbolic representation of the
transition matrix is used but an explicit value for each state of the
concrete model needs to be stored, as for instance in the
so-called hybrid method [8], [33] or using other variants of DDs
[11], and also methods using Kronecker representations [34],
[35], [36], [37], [38], are more precise and might be faster than
the method we propose. They are, however, not applicable if
the state space is excessively large, that is, too large to store one
value per state.

Smith [31] developed means for the compositional abstraction
of CTMCs given in a process calculus. This way, he
obtains an abstract Markov chain, which is then analysed by
a method of Baier et al. [39] to obtain bounds for the
time-bounded reachability probability. Our approach uses a different
abstraction method and can handle a more general class of 
properties. 

A paper by Buchholz [29] describes how bounds on long-run
average (thus, non-transient) properties can be obtained
from abstract Markov chains. It is based on a combination of
policy and value iteration [40], and discusses the applicability
of several variants of these methods on typical examples from 
queueing theory and performance evaluation. 

The magnifying-lens abstraction [30] by de Alfaro et al. is
similar to our approach in that it also builds on (repeated) visits 
of concrete model states without storing the whole concrete 
state space. It discusses a different model, discrete-time Markov 
decision processes (DTMDPs), and a different property, time- 
unbounded reachability probabilities. 

D'Argenio et al. [41] discuss how DTMDPs given as
MTBDDs can be abstracted to obtain a smaller abstract model,
which is also a DTMDP, but small enough to be represented
explicitly. In addition, a heuristic abstraction refinement method
is presented. The target there was to obtain bounds for
unbounded reachability probabilities. Works by Hermanns et
al. [32] and by Kattenbelt et al. [27] later developed methods
to use probabilistic games to provide tighter value bounds
and predicate abstraction to handle larger or even infinitely
large models, as well as refinement methods based on these
frameworks. In contrast to the state of the art for discrete-time
models, the refinement method we consider is more
preliminary.

Other methods work with a finite subset of concrete states of 
the model under consideration, rather than subsuming concrete 
states in abstract ones. Doing so is possible for transient 
analyses, as there often the probability mass is concentrated on 
a small subset of states at each point of time, rather than being 
equally distributed among all states of the model. There exists 
a wide range of methods exploiting this fact in the analysis 
of CTMCs [42], [43], [44], [45]. Recently, this approach was
extended to infinite-state Markov decision processes [46]. Here,
two finite submodels are constructed which are guaranteed to bound
the values over all policies from below and above. They can
also be used to obtain a policy which is $\varepsilon$-optimal in the
original model.

Such methods work well in case there is a moderate number
of states with relevant probability, out of the very large number
of all states the complete model consists of. If this is not
the case, these methods either need too much memory for
storing the state information, or the result becomes imprecise
because too much of the probability mass is not taken into
consideration.



This paper is structured as follows: In Section 2 we provide
basic notations and describe the symbolic data structures used
for the later abstraction. We also describe the formal models
we use, as well as the properties we are interested in. Section 3
describes algorithms to efficiently obtain an abstract model
from a description of a concrete model, and discusses how
they can be used to bound properties of the concrete model.
In Section 4 we apply this method to two case studies to
show its practical applicability. Finally, Section 5 concludes
the paper.

2 Preliminaries 

This section gives basic notations and formally defines the 
models and data structures that are used in the later parts of 
the paper. 

A distribution over a finite or countable set $A$ is a function
$\mu: A \to [0, 1]$ such that $\sum_{a \in A} \mu(a) = 1$. By $Distr(A)$ we denote
the set of all distributions over $A$.

The simplest stochastic model we consider is as follows:
Definition 1: A discrete-time Markov chain (DTMC) is a
tuple $\mathcal{D} = (S, \mathbf{P})$ where

• $S$ is a finite set of states,

• $\mathbf{P}: (S \times S) \to [0, 1]$ is the probability matrix such that
$\sum_{s' \in S} \mathbf{P}(s, s') = 1$ for all $s \in S$.

By $X^{\mathcal{D},s_0}: (\Omega_{\mathcal{D}} \times \mathbb{N}) \to S$ with $s_0 \in S$ we denote the unique
stochastic process [47] of $\mathcal{D}$ with initial state $s_0$, where $\Omega_{\mathcal{D}}$ is
the sample space to be used.

The time in a DTMC proceeds in discrete steps, and in each
step a state change takes place. At step 0 the model starts in a
given initial state $s_0 \in S$. The model moves to the next state,
and will be in $s_1$ with probability $\mathbf{P}(s_0, s_1)$ for all $s_1 \in S$. From
there, again the next state is chosen according to $\mathbf{P}$, and so on.

By $\Pr$, we denote the probability measure on the measurable
space $(\Omega_{\mathcal{D}}, \Sigma_{\mathcal{D}})$ of the DTMC $\mathcal{D}$ under consideration, which
is defined by the cylinder set construction over finite paths [48].
For instance, $\Pr(X_n^{\mathcal{D},s_0} = s_1 \lor X_{n+1}^{\mathcal{D},s_0} = s_2)$ describes the probability
that, having started in state $s_0$, in step $n$ we are in $s_1$ or in
step $n + 1$ we are in $s_2$. For a measurable function $X: \Omega_{\mathcal{D}} \to \mathbb{R}$
we thus also have an expectation $E(X) = \int_{\Omega_{\mathcal{D}}} X(\omega) \Pr(d\omega)$. For
instance, consider $X = \sum_{i=0}^{n-1} (f \circ X_i^{\mathcal{D},s_0})$ such that $f(s_1) = 1$
and $f(s) = 0$ otherwise. Then $E(X)$ denotes the average number
of steps within the first $n$ steps in which the DTMC is in $s_1$,
under the condition that we started in $s_0$.
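The expectation in this example can be computed by propagating the state distribution step by step. The following Python sketch does so for a hypothetical two-state chain; the function name `expected_visits`, the list-of-lists encoding of the probability matrix, and the choice to count steps 0 to n-1 are assumptions made for illustration only.

```python
def expected_visits(P, s0, target, n):
    """Expected number of steps i in 0..n-1 at which the DTMC is in `target`.

    P is a row-stochastic matrix given as a list of lists,
    s0 is the index of the initial state.
    """
    num_states = len(P)
    dist = [0.0] * num_states
    dist[s0] = 1.0  # point distribution on the initial state
    total = 0.0
    for _ in range(n):
        total += dist[target]  # probability of being in `target` at this step
        # one step of the chain: new_dist = dist * P
        dist = [sum(dist[s] * P[s][t] for s in range(num_states))
                for t in range(num_states)]
    return total


# Hypothetical example: state 1 is absorbing, state 0 leaves with probability 0.5.
P = [[0.5, 0.5],
     [0.0, 1.0]]
print(expected_visits(P, s0=0, target=1, n=3))  # 0 + 0.5 + 0.75 = 1.25
```

The loop never stores paths; it only keeps the current distribution, which is what makes this kind of computation feasible for explicit models of moderate size.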

We now discuss our basic stochastic model described 
informally in the introduction. 

Definition 2: A (uniform) continuous-time Markov chain
(CTMC) is a tuple $\mathcal{C} = (S, \mathbf{R})$ where $S$ is as in Definition 1
and

• $\mathbf{R}: (S \times S) \to \mathbb{R}_{\ge 0}$ is the rate matrix such that there is a
uniformisation rate $\mathbf{u}(\mathcal{C}) > 0$ with $\sum_{s' \in S} \mathbf{R}(s, s') = \mathbf{u}(\mathcal{C})$
for all $s \in S$.

If $\mathcal{C}$ is clear from context, we write $\mathbf{u}$ instead of $\mathbf{u}(\mathcal{C})$. By
$X^{\mathcal{C},s_0}: (\Omega_{\mathcal{C}} \times \mathbb{R}_{\ge 0}) \to S$ with $s_0 \in S$ we denote the uniquely
defined stochastic process [47] of $\mathcal{C}$ with initial state $s_0$, where
$\Omega_{\mathcal{C}}$ is the sample space to be used.

In a non-uniform CTMC, the requirement $\sum_{s' \in S} \mathbf{R}(s, s') = \mathbf{u}(\mathcal{C})$
does not hold for all states. Every non-uniform CTMC can
be transformed into an equivalent uniform CTMC with the same
stochastic behaviour by increasing $\mathbf{R}(s, s)$ such that the total
sum is the same for all states [47]. We require uniformity only
for ease of presentation; it does not restrict the applicability
of the methods developed here to general CTMCs.
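The transformation to a uniform CTMC amounts to padding each row of the rate matrix with a self-loop. A minimal Python sketch, assuming a dense list-of-lists rate matrix and a hypothetical helper name `uniformise`, could look as follows.

```python
def uniformise(R, u=None):
    """Return (R_uniform, u) where every row of R_uniform sums to u.

    R is a rate matrix as a list of lists. If u is not given, the largest
    row sum is used; any u at least that large is admissible.
    """
    row_sums = [sum(row) for row in R]
    if u is None:
        u = max(row_sums)
    if u < max(row_sums):
        raise ValueError("uniformisation rate must dominate all row sums")
    R_uniform = [row[:] for row in R]  # copy, leave the input untouched
    for s, total in enumerate(row_sums):
        R_uniform[s][s] += u - total  # increase the self-loop rate R(s, s)
    return R_uniform, u


# Hypothetical non-uniform example: row sums are 3 and 1.
R = [[0.0, 3.0],
     [1.0, 0.0]]
R_uniform, u = uniformise(R)
print(R_uniform, u)  # [[0.0, 3.0], [1.0, 2.0]] 3.0
```

Self-loops do not change the sequence of distinct states visited nor the distribution of the time spent in them, which is why the padded chain has the same stochastic behaviour.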

The behaviour of a CTMC $\mathcal{C} = (S, \mathbf{R})$ is similar to that of a DTMC.
However, the durations until state changes are now real numbers.
They are chosen according to independent negative exponential
distributions with parameter $\mathbf{u}$. Thus, the probability that a
state change takes place within time $t$ is $1 - \exp(-\mathbf{u}t)$. The
successor state is then selected according to the distribution
$\mu: S \to [0, 1]$ with $\mu(s') = \frac{\mathbf{R}(s, s')}{\mathbf{u}}$ for all $s' \in S$. We assume
that the process runs until a certain point of time $\mathbf{t}$ is reached.

As for DTMCs, we assume that we have probability measures 
and expectations on the sample spaces. 

We need to specify another discrete-time and a continuous-time
model [49], [50], which will later on be used to abstract large
CTMCs. In addition to stochastic behaviour, these models
also feature a nondeterministic choice over the successor
distributions. Nondeterministic choices are choices which
cannot be assigned a probability a priori. Instead, different
stochastic behaviours result according to the resolution of the
nondeterminism.

Definition 3: A discrete-time Markov decision process (DTMDP)
is a tuple $\mathcal{D} = (S, Act, \mathbf{P})$ where $S$ is as in Definition 1
and

• $Act$ is a set of actions,

• $\mathbf{P}: (S \times Act \times S) \to [0, 1]$ is the probability matrix such
that $\sum_{s' \in S} \mathbf{P}(s, \alpha, s') \in \{0, 1\}$ for all $s \in S$ and $\alpha \in Act$.

For $s \in S$, we denote the set of enabled actions with
$Act(s) = \{\alpha \in Act \mid \sum_{s' \in S} \mathbf{P}(s, \alpha, s') = 1\}$. We require that
either $|Act| < \infty$ or that for all $s \in S$ and all $p: S \to \mathbb{R}_{\ge 0}$ the
set $\{\sum_{s' \in S} \mathbf{P}(s, \alpha, s') \cdot p(s') \mid \alpha \in Act(s)\}$ is compact.

The behaviour of a DTMDP is such that upon entering a state
$s \in S$, an action $\alpha \in Act(s)$, or possibly a distribution over
actions, is chosen. This choice determines the probabilities of
the state the model moves to in the next time step. Notice that
we do indeed allow uncountably many actions, with the given
restriction.

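Definition 4: A (uniform) continuous-time Markov decision
process (CTMDP) is a tuple $\mathcal{C} = (S, Act, \mathbf{R})$ where $S$ and $Act$
are as in Definition 3, and $\mathbf{R}: (S \times Act \times S) \to \mathbb{R}_{\ge 0}$ is the rate matrix
such that there is a fixed $\mathbf{u}(\mathcal{C})$ where for all $s \in S$ and $\alpha \in Act$
it is $\sum_{s' \in S} \mathbf{R}(s, \alpha, s') \in \{0, \mathbf{u}(\mathcal{C})\}$. If $\mathcal{C}$ is clear from the context,
we write $\mathbf{u}$ instead of $\mathbf{u}(\mathcal{C})$. For $s \in S$, we denote the set of
enabled actions with $Act(s) = \{\alpha \in Act \mid \sum_{s' \in S} \mathbf{R}(s, \alpha, s') = \mathbf{u}\}$.
We require that either $|Act| < \infty$ or that for all $s \in S$ and
all $p: S \to \mathbb{R}_{\ge 0}$ the set $\{\sum_{s' \in S} \mathbf{R}(s, \alpha, s') \cdot p(s') \mid \alpha \in Act(s)\}$ is
compact.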

As in a DTMDP, upon entering a state $s$, an action $\alpha \in Act(s)$
(or a distribution over this set) is chosen to determine the
distribution over the successor states. As for CTMCs, the model
moves to this successor state after a time given according to
the negative exponential distribution with parameter $\mathbf{u}$.

We need the following transformation from continuous-time 
to discrete-time models. 

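Definition 5: Given a CTMDP $\mathcal{C} = (S, Act, \mathbf{R})$, the embedded
DTMDP is defined as $\mathrm{emb}(\mathcal{C}) = (S, Act, \mathbf{P})$ with
$\mathbf{P}(s, \alpha, s') = \frac{\mathbf{R}(s, \alpha, s')}{\mathbf{u}(\mathcal{C})}$ for all $s, s' \in S$ and $\alpha \in Act(s)$.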

We introduce a formalism to specify CTMDPs, extending the
abstract Markov chains by Klink et al. [26], [28], [51], which
is also a specific form of a constraint Markov chain [52]. The
purpose of this model is to efficiently represent CTMDPs with
a large number of actions. Instead of explicitly enumerating
all possible choices over successor distributions, it allows one to
specify lower and upper bounds on the rates between states.

Definition 6: An extended abstract continuous-time Markov
chain (ECTMC) is a tuple $\mathcal{C} = (S, Act, \mathbf{I}^l, \mathbf{I}^u)$ where $S$ is
as in Definition 3 and $Act$ is a finite set. We consider the
uniformisation rate $\mathbf{u}(\mathcal{C})$ of the model. The intervals are partial
functions of the form $\mathbf{I}^l, \mathbf{I}^u: (S \times Act) \rightharpoonup (S \to [0, \mathbf{u}])$. We
require $\mathbf{I}^l$ and $\mathbf{I}^u$ to have the same domain. In addition, for
each $s \in S$ there must be at least one $\alpha \in Act$ such that $\mathbf{I}^l(s, \alpha)$ is
defined.

The CTMDP semantics of an ECTMC is defined as $[\![\mathcal{C}]\!] =
(S, Act', \mathbf{R})$. We have $(\alpha, v) \in Act'$ if $\alpha \in Act$ and if $v$ is of the
form $v: S \to [0, \mathbf{u}]$ with $\sum_{s' \in S} v(s') = \mathbf{u}$. It is $(\alpha, v) \in Act'(s)$
if $\mathbf{I}^l(s, \alpha)$ and $\mathbf{I}^u(s, \alpha)$ are defined and for all $s' \in S$ it is
$v(s') \in [\mathbf{I}^l(s, \alpha)(s'), \mathbf{I}^u(s, \alpha)(s')]$. We let $\mathbf{R}(s, (\alpha, v), s') = v(s')$.
An ECTMC thus represents a CTMDP in which for each state
$s$ one chooses a possible action $\alpha$ of $Act$. In addition, one
has to choose an assignment of successor rates which fulfils
the requirement on the intervals. This way, the action set is
uncountably large, but satisfies the requirements of Definition 4.
The difference to the model of Klink et al. is the choice of
$\alpha \in Act$ before the choice of the successor rates. This allows
us to obtain more precise abstractions than we could obtain if we
were using (non-extended) abstract Markov chains, while it
still allows efficient analysis methods to be implemented, as seen
later in Section 3 and Section 4.

To obtain a stochastic process from nondeterministic models, 
the nondeterminism must be resolved. Schedulers (or policies) 
formalise the mechanism to do so. Below, we define the most 
powerful class of schedulers we consider in this paper, and 
the stochastic processes they induce. A scheduler of this class 
can resolve the nondeterminism according to the states and 
actions (and their sequence) that were visited before the model 
moved to the current state. It may also decide not to pick one 
specific action, but rather involve a probabilistic choice over 
the enabled actions of a state. It is however neither aware of 
the exact time at which former events happened, nor of the 
current time. 

Definition 7: A time-abstract, history-dependent, randomised
scheduler (HR) for a DTMDP $\mathcal{D} = (S, Act, \mathbf{P})$ or a CTMDP
$\mathcal{C} = (S, Act, \mathbf{R})$ is a function $\sigma: ((S \times Act)^* \times S) \to Distr(Act)$
such that for all $\beta \in (S \times Act)^*$ and $s \in S$ we have that if
$\sigma(\beta, s)(\alpha) > 0$ then $\alpha \in Act(s)$. With $\Sigma_{HR}$ we denote the set of
all HRs.

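Definition 8: Assume we are given a CTMDP $\mathcal{C} =
(S, Act, \mathbf{R})$ and a HR $\sigma: ((S \times Act)^* \times S) \to Distr(Act)$. We
define the induced CTMC as $\mathcal{C}_\sigma = (S', \mathbf{R}')$ with

• $S' = (S \times Act)^* \times S$,

• $\mathbf{R}'((\beta, s), (\beta, s, \alpha, s')) = \sigma(\beta, s)(\alpha) \cdot \mathbf{R}(s, \alpha, s')$ for $\beta \in (S \times
Act)^*$, $s, s' \in S$, $\alpha \in Act$, and $\mathbf{R}'(\cdot, \cdot) = 0$ otherwise.

Let $X^{\mathcal{C}_\sigma,s_0}: (\Omega_{\mathcal{C}_\sigma} \times \mathbb{R}_{\ge 0}) \to ((S \times Act)^* \times S)$ be the stochastic
process of the CTMC $\mathcal{C}_\sigma$ with initial state $s_0 \in S$ and let
$f: ((S \times Act)^* \times S) \to S$ with $f(\beta, s) = s$. The induced
stochastic process $X^{\mathcal{C},\sigma,s_0}: (\Omega_{\mathcal{C}_\sigma} \times \mathbb{R}_{\ge 0}) \to S$ of $\mathcal{C}$ and $\sigma$ starting
in $s_0$ is then defined as $X_t^{\mathcal{C},\sigma,s_0} = f \circ X_t^{\mathcal{C}_\sigma,s_0}$ for $t \in \mathbb{R}_{\ge 0}$.

Definitions for DTMDPs are analogous, using $\mathbf{P}$ instead of $\mathbf{R}$.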



We specify a simpler subclass of the schedulers of Definition 7.
Schedulers of this class are only aware of the number of
state changes that have happened so far, and may only choose
a specific successor distribution rather than a distribution over
them.

Definition 9: A time-abstract, history-abstract, counting,
deterministic scheduler (CD) for a DTMDP $\mathcal{D} = (S, Act, \mathbf{P})$
or a CTMDP $\mathcal{C} = (S, Act, \mathbf{R})$ is a function $\sigma: (S \times \mathbb{N}) \to Act$
such that for all $s \in S$ and $n \in \mathbb{N}$, if $\sigma(s, n) = \alpha$ then $\alpha \in Act(s)$.
With $\Sigma_{CD}$ we denote the set of all CDs.

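Definition 10: Assume we are given a CTMDP $\mathcal{C} =
(S, Act, \mathbf{R})$ and a CD $\sigma: (S \times \mathbb{N}) \to Act$. We define the induced
CTMC as $\mathcal{C}_\sigma = (S', \mathbf{R}')$ with

• $S' = S \times \mathbb{N}$,

• $\mathbf{R}'((s, n), (s', n + 1)) = \mathbf{R}(s, \sigma(s, n), s')$ for $s, s' \in S$ and
$n \in \mathbb{N}$, and $\mathbf{R}'(\cdot, \cdot) = 0$ otherwise.

Let $X^{\mathcal{C}_\sigma,s_0}: (\Omega_{\mathcal{C}_\sigma} \times \mathbb{R}_{\ge 0}) \to (S \times \mathbb{N})$ be the stochastic process of
the CTMC $\mathcal{C}_\sigma$ and let $f: (S \times \mathbb{N}) \to S$ with $f(s, n) = s$. The
induced stochastic process $X^{\mathcal{C},\sigma,s_0}: (\Omega_{\mathcal{C}_\sigma} \times \mathbb{R}_{\ge 0}) \to S$ of $\mathcal{C}$ and
$\sigma$ starting in $s_0$ is then defined as $X_t^{\mathcal{C},\sigma,s_0} = f \circ X_t^{\mathcal{C}_\sigma,s_0}$ for
$t \in \mathbb{R}_{\ge 0}$. Definitions for DTMDPs are analogous, using $\mathbf{P}$ instead
of $\mathbf{R}$.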

We equip our models with reward structures, assigning 
values to states. 

Definition 11: A reward structure for a stochastic process
$X: (\Omega \times \mathbb{R}_{\ge 0}) \to S$ or a CTMC or CTMDP with state set $S$ is
a tuple $(r_c, r_f)$ with $r_c: S \to \mathbb{R}_{\ge 0}$ and $r_f: S \to \mathbb{R}_{\ge 0}$. We call
$r_c$ the cumulative reward rate and $r_f$ the final reward value.
We let $r_f^{\max} = \max_{s \in S} r_f(s)$ and $r_c^{\max} = \max_{s \in S} r_c(s)$.

For CTMCs, the cumulative reward rate $r_c(s)$ is the reward
obtained per time unit for staying in state $s$, until the time bound
$\mathbf{t}$ is reached. The final reward value $r_f(s)$ specifies the reward
one obtains for being in state $s$ at time $\mathbf{t}$. We are interested
in the expected values of these numbers, as formalised in
Definition 12. For CTMDPs, we strive for the maximal (and
analogously the minimal) value under all possible schedulers
in the class we considered.

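Definition 12: Given a time bound $\mathbf{t} \in \mathbb{R}_{\ge 0}$, the value
of a stochastic process $X: (\Omega \times \mathbb{R}_{\ge 0}) \to S$ with a reward
structure $r = (r_c, r_f)$ is defined as $V(X, r, \mathbf{t}) = E\left[\int_0^{\mathbf{t}} r_c(X_u)\, du +
r_f(X_{\mathbf{t}})\right]$. For a CTMC $\mathcal{C} = (S, \mathbf{R})$ and $s_0 \in S$, we let
$V(\mathcal{C}, s_0, r, \mathbf{t}) = V(X^{\mathcal{C},s_0}, r, \mathbf{t})$. For a CTMDP $\mathcal{C} = (S, Act, \mathbf{R})$,
the maximal value for $s_0 \in S$ is defined as $V^{\max}(\mathcal{C}, s_0, r, \mathbf{t}) =
\max_{\sigma \in \Sigma_{HR}} V(X^{\mathcal{C},\sigma,s_0}, r, \mathbf{t})$.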

The interpretation of rewards and values depends on the model
under consideration. For instance, in a CTMC representing a
chemical reaction, we might assign a final reward value of $n$
to state $s$ if $s$ contains $n$ molecules of a given species. This
way, the value of the CTMC represents the expected number
of molecules of this species at a given point of time.

We do not explicitly consider impulse (instantaneous) rewards
$r_i: (S \times S) \to \mathbb{R}_{\ge 0}$ for CTMCs here, that is, rewards
obtained for moving from one state to another. However,
given cumulative reward rates $r_c$ and impulse rewards $r_i$,
we can define cumulative reward rates as $r_c'(s) = r_c(s) +
\sum_{s' \in S} \mathbf{R}(s, s') \cdot r_i(s, s')$. For the properties under consideration,
this new reward structure is equivalent to the one which
uses impulse rewards (follows from [53, (6)]). Definition 12
resembles the approach considered in a recent paper [54],
where we maximised the value over a more general class of
schedulers than the one of Definition 7.

An important special case of the value is the time-bounded
reachability probability, as it occurs for instance in the
time-bounded until property of the probabilistic logic CSL [55].
Given a set of target states $B$, we can express the probability
to be in $B$ at time $t$ by using a reward structure with $r_c(s) = 0$
for all $s \in S$, and $r_f(s) = 1$ if $s \in B$ and $r_f(s) = 0$ otherwise. For the
probability to reach $B$ within time $t$, we additionally modify
the rate matrix $\mathbf{R}$ such that $\mathbf{R}(s, s) = \mathbf{u}$ and $\mathbf{R}(s, s') = 0$ for
$s' \neq s$ if $s \in B$. For CTMCs, the case to reach $B$ within an
interval $[a, b]$ with $0 < a < b$ can be handled by two successive
analyses [55, Theorem 3]. The unbounded until ($[0, \infty)$ and
$[a, \infty)$) can be handled similarly [56, Section 4.4].
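The reduction of time-bounded reachability to a reward structure is mechanical. The Python sketch below builds the modified rate matrix and the reward vectors for a small explicit CTMC; the helper name `reachability_setup` and the dense matrix encoding are hypothetical and serve only to illustrate the construction described above.

```python
def reachability_setup(R, u, B):
    """Encode 'reach B within the time bound' as a reward structure.

    R is a uniform rate matrix (every row sums to u), B a set of state indices.
    Returns (R_abs, r_c, r_f): target states are made absorbing, the
    cumulative reward rate is 0 everywhere and the final reward marks B.
    """
    n = len(R)
    R_abs = [row[:] for row in R]
    for s in B:
        R_abs[s] = [0.0] * n  # R(s, s') = 0 for s' != s
        R_abs[s][s] = u       # R(s, s) = u
    r_c = [0.0] * n
    r_f = [1.0 if s in B else 0.0 for s in range(n)]
    return R_abs, r_c, r_f


# Hypothetical three-state example with uniformisation rate 2 and target {2}.
R = [[1.0, 1.0, 0.0],
     [0.0, 1.0, 1.0],
     [1.0, 0.0, 1.0]]
R_abs, r_c, r_f = reachability_setup(R, u=2.0, B={2})
print(R_abs[2], r_f)  # [0.0, 0.0, 2.0] [0.0, 0.0, 1.0]
```

Because the target states are absorbing, being in $B$ at the time bound coincides with having reached $B$ at some earlier point, so the value of this reward structure is the reachability probability.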

The fact that Definition 12 involves both final and cumulative
rewards for CTMCs allows us to express the values at different
points of time in the following way. Assume we want to
consider the cumulative rewards $v_1, v_2, \ldots$ at consecutive
time points $t_1 = \delta_1$, $t_2 = t_1 + \delta_2, \ldots$ A short calculation then
shows that $v_1 = V(\mathcal{C}, (r_c, 0), \delta_1)$, $v_2 = V(\mathcal{C}, (r_c, v_1), \delta_2), \ldots$
The formulation for the final reward at different points of time
is analogous.

Prism Q is a widely used tool, which features a guarded 
command language to model CTMCs (among other classes). 
For our purposes, it suffices to take a rather abstract view on 
the high-level modelling language used by this tool. 

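Definition 13: A Prism model (PM) is a tuple $\mathcal{M} =
(Var, init, C, succ, R_c, R_f)$ where $Var$ is a set of Boolean variables,
with $init: Var \to \{0, 1\}$ we denote the initial state, and $C$
is a finite set of commands. Let $S_{Var} = \{s: Var \to \{0, 1\}\}$. The
cumulative reward rate is a function $R_c: S_{Var} \to \mathbb{R}_{\ge 0}$, and the
final reward value is a function $R_f: S_{Var} \to \mathbb{R}_{\ge 0}$. The successor function is a
partial function of the form $succ: (S_{Var} \times C) \rightharpoonup (S_{Var} \times \mathbb{R}_{\ge 0})$. We
write $succ_S(s, c) = s'$ if $succ(s, c) = (s', \lambda)$ for some $\lambda \in \mathbb{R}_{\ge 0}$.
We also let

$succ(s) = \{ (s', \lambda) \mid \exists c.\, succ_S(s, c) = s' \wedge \lambda = \sum_{c' \in C:\, succ(s, c') = (s', \lambda')} \lambda' \}$.

Further, $succ_S(s) = \{s' \mid \exists \lambda.\, (s', \lambda) \in succ(s)\}$. Let $succ^0 = \{init\}$
and $succ^{i+1} = \bigcup_{s \in succ^i} succ_S(s)$. The set of reachable states
(state space) is $S_{\mathcal{M}} = \bigcup_{i=0}^{\infty} succ^i$. We require that there is
$\mathbf{u}(\mathcal{M}) > 0$ such that for all $s \in S_{\mathcal{M}}$ it is $\mathbf{u}(\mathcal{M}) = \sum \{\!| \lambda \mid \exists s'.\, (s', \lambda) \in
succ(s) |\!\}$ (where the latter is a multiset).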
The complete Prism syntax also defines models consisting of
several modules, that is, sets of guarded commands, which may
synchronise or interleave. However, as the semantics of a model
with several modules is defined as one with a single module, a
single set of commands suffices. Prism also allows one to specify
commands with several pairs of successors $(s'_1, \lambda_1), \ldots, (s'_n, \lambda_n)$.
For PMs describing CTMCs, such a command is equivalent
to a set of $n$ commands $c_i$ in the above form: Each of them
must be activated (defined) in the same states as the original
command, and we then have $succ(s, c_i) = (s'_i, \lambda_i)$. The bounded
integers Prism supports can be represented by a binary encoding.
Impulse rewards can be transformed to cumulative rewards, as
discussed for CTMCs. For models in which there is no $\mathbf{u}(\mathcal{M})$
with the required property, we can add an extra command to
increase the self-loop rate where necessary.

The formal semantics of a PM is as follows. 

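Definition 14: Consider a PM $\mathcal{M} = (Var, init, C, succ, R_c, R_f)$.
The induced CTMC is $\mathcal{C}_{\mathcal{M}} = (S_{\mathcal{M}}, \mathbf{R})$ such that for all $s, s' \in S_{\mathcal{M}}$
we have $\mathbf{R}(s, s') = \lambda$ if $(s', \lambda) \in succ(s)$ and $\mathbf{R}(s, s') = 0$ if
no such tuple exists. The induced reward structure is $r_{\mathcal{M}} =
(R_c, R_f)$.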
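The semantics of a PM can be made concrete by an explicit exploration of the reachable states. The following Python sketch is a simplified illustration, not the Prism implementation: states are tuples of Boolean values, each command is modelled as a hypothetical Python function returning either `None` (not enabled) or a pair of successor state and rate, and rates of several commands leading to the same successor are summed.

```python
from collections import deque


def induced_ctmc(init, commands):
    """Explore the reachable states and collect the induced rate matrix.

    Returns (reachable, rates), where rates maps (s, s') to the summed rate.
    """
    rates = {}
    reachable = {init}
    queue = deque([init])
    while queue:
        s = queue.popleft()
        for command in commands:
            result = command(s)
            if result is None:  # command is not enabled in s
                continue
            s_next, lam = result
            rates[(s, s_next)] = rates.get((s, s_next), 0.0) + lam
            if s_next not in reachable:
                reachable.add(s_next)
                queue.append(s_next)
    return reachable, rates


# Hypothetical model over one Boolean variable x; the last command is a
# self-loop added so that both states have the same total rate 3.
commands = [
    lambda s: ((1,), 2.0) if s == (0,) else None,
    lambda s: ((1,), 1.0) if s == (0,) else None,
    lambda s: ((0,), 2.0) if s == (1,) else None,
    lambda s: ((1,), 1.0) if s == (1,) else None,
]
states, rates = induced_ctmc((0,), commands)
print(sorted(states), rates[((0,), (1,))])  # [(0,), (1,)] 3.0
```

Such an explicit exploration stores every reachable state, which is what the symbolic and abstraction-based techniques of this paper are designed to avoid for very large models.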

In Section 3 we will abstract CTMCs into ECTMCs. To do
so, we will subsume several concrete states of a CTMC to
abstract states of an ECTMC.

Definition 15: Given a PM $\mathcal{M}$, a partitioning of the state
space $S_{\mathcal{M}}$ is a finite ordered set $\mathfrak{P} = (\mathfrak{z}_0, \ldots, \mathfrak{z}_{n-1})$ of non-empty,
pairwise disjoint subsets of $S_{\mathcal{M}}$ such that $S_{\mathcal{M}} = \bigcup_{i=0}^{n-1} \mathfrak{z}_i$.

Binary decision diagrams [57] are an efficient tool to symbolically
represent structures which are too large to be represented
in an explicit form.

Definition 16: We fix a finite ordered set $V = (x_1, \ldots, x_m)$
of Boolean variables. A binary decision diagram (BDD) is a
rooted acyclic directed graph $b$ with node set $N$ and root node
$n_{root}$. There are two types of nodes in $N$. Outer nodes $n$
do not have outgoing edges and are labelled with a
value $v(n) \in \{0, 1\}$. The remaining nodes are inner nodes $n \in N$,
which have exactly two successor nodes, denoted by $h(n)$ and
$l(n)$. Inner nodes $n$ are labelled with a variable $v(n) \in V$.
A variable valuation is a function $\nu: V \to \{0, 1\}$. We denote the
set of all variable valuations by $Val$. Each valuation $\nu$ induces a
unique path in the BDD from the root node to an outer node: At
an inner node $n$ we follow the edge to $h(n)$ if $\nu(v(n)) = 1$ and
the edge to $l(n)$ if $\nu(v(n)) = 0$. The function $[\![b]\!]: Val \to \{0, 1\}$
represented by a BDD $b$ returns for a variable valuation $\nu$ the
value of the outer node reached by following the path induced
by $\nu$.
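The evaluation of a BDD under a valuation can be sketched directly from this definition. In the Python fragment below, the class name `Node`, its field names and the example diagram for the conjunction of two variables are assumptions chosen for illustration; real BDD packages additionally share and reduce nodes.

```python
class Node:
    """A BDD node: an outer node carries `value`, an inner node carries
    a variable `var` and the two successors `high` and `low`."""

    def __init__(self, var=None, high=None, low=None, value=None):
        self.var = var
        self.high = high
        self.low = low
        self.value = value


def evaluate(root, valuation):
    """Follow the path induced by `valuation` (a dict from variable names
    to 0 or 1) and return the value of the outer node that is reached."""
    node = root
    while node.value is None:  # still at an inner node
        node = node.high if valuation[node.var] == 1 else node.low
    return node.value


# Hypothetical BDD for the function x1 AND x2 with variable order x1 < x2.
zero, one = Node(value=0), Node(value=1)
n2 = Node(var="x2", high=one, low=zero)
root = Node(var="x1", high=n2, low=zero)
print(evaluate(root, {"x1": 1, "x2": 1}))  # 1
```

The cost of one evaluation is bounded by the number of variables, independently of how many valuations the diagram represents.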

Definition 17: A BDD is ordered if for all inner nodes n 
the following condition holds: Either h(n) is an outer node 
or v(n) < v(h(n)) and the same for l(n). A BDD is reduced 
if all sub-BDDs rooted at the different nodes of the BDD 
represent distinct functions. Reduced and ordered BDDs are 
called OBDDs. 

The OBDD for the constant 0 (1) function, which consists
of a single outer node labelled with 0 (1, resp.), is denoted in
the sequel by $bdd_0$ ($bdd_1$, resp.).

OBDDs are a canonical representation (up to isomorphism)
of arbitrary functions $f: Val \to \{0, 1\}$ [57]. In the following
we will only use OBDDs. For more details on (O)BDDs we
refer the reader to [57], [58].

OBDDs support a wide range of operations like the
Boolean operations $\wedge$, $\vee$, and $\neg$. Given two ordered sets
$V_1 = (x_{i_1}, \ldots, x_{i_k})$ and $V_2 = (x_{j_1}, \ldots, x_{j_k})$ of Boolean variables,
by $b' = b[V_1/V_2]$ we denote the OBDD which results from
renaming the variables in $V_1$ to the corresponding variables in $V_2$.
For $V' \subseteq V$ and an OBDD $b$, we let $[\![\exists V'. b]\!](\nu) = \bigvee \{[\![b]\!](\nu') \mid \forall x \notin
V'.\, \nu'(x) = \nu(x)\}$ be the existential quantification of the variables
in $V'$.

We can use OBDDs to represent PMs in a symbolic form, 
if we leave out the stochastic aspects. 

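Definition 18: Consider a PM $\mathcal{M} = (Var, init, C, succ,
R_c, R_f)$. The OBDD representation of $\mathcal{M}$ is a tuple $b_{\mathcal{M}} =
(Var, Var', \mathsf{init}, (\mathsf{succ}_c)_{c \in C})$. There $Var$ and $Var'$ are sets of
Boolean variables with $Var \cap Var' = \emptyset$ such that there is a
one-to-one mapping between variables $x \in Var$ and $x' \in Var'$.
Further, $\mathsf{init}$ and $\mathsf{succ}_c$ are OBDDs over the variables $Var$
and $Var \cup Var'$, respectively. We require that $[\![\mathsf{init}]\!](\nu) = 1$ iff
for all $x \in Var$ it is $init(x) = \nu(x)$. For all $\mathsf{succ}_c$ we require
$[\![\mathsf{succ}_c]\!](\nu) = 1$ iff there are states $s, s'$ such that for all $x \in Var$ it is
$\nu(x) = s(x)$, $\nu(x') = s'(x)$, and $succ(s, c) = (s', \lambda)$ for some $\lambda$.
By $\mathsf{succ}$ we denote the OBDD such that
$[\![\mathsf{succ}]\!] = \bigvee_{c \in C} [\![\mathsf{succ}_c]\!]$.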

OBDDs can also be used to symbolically represent a
partitioning of the state space of a PM. Let $\mathfrak{P} = (\mathfrak{z}_0, \ldots, \mathfrak{z}_{n-1})$
be a partitioning of the PM $\mathcal{M} = (Var, init, C, succ, R_c, R_f)$. The
idea is to assign to each block $\mathfrak{z}_i$ of $\mathfrak{P}$ a unique block number
$i$ and to use a binary representation of $i$, which is encoded
using $k = \lceil \log_2 n \rceil$ novel BDD variables $\mathfrak{B} = (\mathfrak{x}_0, \ldots, \mathfrak{x}_{k-1})$.

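Definition 19: The OBDD representation of $\mathfrak{P} =
(\mathfrak{z}_0, \ldots, \mathfrak{z}_{n-1})$ is the OBDD $b_{\mathfrak{P}}$ over the variables $\mathfrak{B} \uplus Var$,
where $\mathfrak{B} = (\mathfrak{x}_0, \ldots, \mathfrak{x}_{k-1})$ with $k = \lceil \log_2 n \rceil$. We require that
$[\![b_{\mathfrak{P}}]\!](\nu) = 1$ iff there is $s \in S_{\mathcal{M}}$ such that for all $x \in Var$ we
have $\nu(x) = s(x)$ and there is $\mathfrak{z} \in \mathfrak{P}$ such that $s \in \mathfrak{z}$ and for all
$\mathfrak{x} \in \mathfrak{B}$ we have $\nu(\mathfrak{x}) = \mathfrak{z}(\mathfrak{x})$. With $\mathfrak{z}_i(\mathfrak{x}_j) = (i \,\mathrm{div}\, 2^j) \bmod 2$ we
denote the value of variable $\mathfrak{x}_j \in \mathfrak{B}$ in the binary encoding of
the block number $i$ of $\mathfrak{z}_i \in \mathfrak{P}$. With $b_{\mathfrak{z}}$ we denote the OBDD
such that $[\![b_{\mathfrak{z}}]\!](\nu) = 1$ iff $\nu$ represents a state of $\mathfrak{z}$.

Regarding the variable order of $Var \uplus \mathfrak{B}$, we assume in the
following that all variables in $Var$ precede all variables in $\mathfrak{B}$.
This leads to more efficient algorithms for accessing the block
number of a given state.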
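The binary encoding of block numbers can be illustrated with a few lines of Python. The function names below are hypothetical; the bit for variable number j of block i is computed as (i div 2^j) mod 2, and the number of variables as the ceiling of the binary logarithm of the number of blocks.

```python
def num_block_vars(n):
    """Number k of Boolean variables needed for n blocks, k = ceil(log2(n)).

    Uses integer arithmetic to avoid floating-point rounding.
    """
    if n < 1:
        raise ValueError("a partitioning has at least one block")
    return (n - 1).bit_length()


def block_bits(i, k):
    """Values of the k block variables in the encoding of block number i;
    entry j is (i div 2^j) mod 2."""
    return [(i // 2 ** j) % 2 for j in range(k)]


def block_number(bits):
    """Inverse of block_bits: recover the block number from its bits."""
    return sum(bit * 2 ** j for j, bit in enumerate(bits))


k = num_block_vars(5)    # five blocks need three variables
print(k, block_bits(4, k))  # 3 [0, 0, 1]
```

Decoding a block number from an OBDD path then amounts to reading the block variables along the path and applying `block_number`.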

3 Algorithms 

In this section, we first describe an algorithm to approximate
minimal and maximal values of CTMDPs. Afterwards, we
describe how to obtain an ECTMC from a PM, such that its
induced CTMDP is a valid abstraction (cf. Proposition 2) of
the CTMC semantics of the PM. We provide an algorithm
which computes an ECTMC overapproximation of a PM given
in an OBDD representation. Using the first algorithm, we can
obtain intervals from this abstraction which are guaranteed to
contain the actual value of the CTMC.

3.1 Computing Reward Values for CTMDPs 

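Let $\phi_\lambda(i) = \frac{\lambda^i}{i!} e^{-\lambda}$ denote the probabilities of a Poisson
distribution with parameter $\lambda$, and let $\psi_\lambda(i) = \sum_{j=i+1}^{\infty} \phi_\lambda(j) =
1 - \sum_{j=0}^{i} \phi_\lambda(j)$.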
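Both quantities are easy to evaluate numerically for moderate parameters. The Python sketch below uses the hypothetical function names `phi` and `psi` and the direct formulas; for large parameters a numerically stable scheme would be needed instead, which is outside the scope of this illustration.

```python
import math


def phi(lam, i):
    """Poisson probability of exactly i events for parameter lam."""
    return math.exp(-lam) * lam ** i / math.factorial(i)


def psi(lam, i):
    """Poisson tail probability of more than i events for parameter lam."""
    return 1.0 - sum(phi(lam, j) for j in range(i + 1))


lam = 2.0
# The tail probabilities sum up to the mean of the distribution, here lam.
print(sum(psi(lam, n) for n in range(60)))
```

The identity used in the last line, namely that the tail probabilities of a non-negative integer random variable sum to its expectation, is what makes the tails a natural weight for cumulative rewards.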






The algorithm to compute the maximal values of CTMDPs
is given in Algorithm 1. The input is a CTMDP $\mathcal{C}$ with reward
structure $r = (r_c, r_f)$, and the precision $\varepsilon > 0$ up to which the
values are to be computed. The algorithm for the minimum
is analogous, replacing max by min. The requirement on the
actions in Definition 4 assures that the maximum in Algorithm 1
exists.

Algorithm 1: Compute maximal values for $\mathcal{C} = (S, Act, \mathbf{R})$,
$r = (r_c, r_f)$ up to $\varepsilon$.

1 let $k$ s. t. $\sum_{n=0}^{k} \psi_{\mathbf{u}\mathbf{t}}(n) > \mathbf{u}\mathbf{t} - \frac{\varepsilon \mathbf{u}}{2 r_c^{\max}} \wedge \psi_{\mathbf{u}\mathbf{t}}(k) \cdot r_f^{\max} < \frac{\varepsilon}{2}$
2 $\mathcal{C}' = (S, Act, \mathbf{P}) := \mathrm{emb}(\mathcal{C})$
3 forall $s \in S$ do $q_{k+1}(s) := 0$
4 forall $i = k, k - 1, \ldots, 0$ do
5     forall $s \in S$ do
6         $m := \max_{\alpha \in Act(s)} \sum_{s' \in S} \mathbf{P}(s, \alpha, s') \cdot q_{i+1}(s')$
7         $q_i(s) := m + \phi_{\mathbf{u}\mathbf{t}}(i) \cdot r_f(s) + \psi_{\mathbf{u}\mathbf{t}}(i) \cdot \frac{r_c(s)}{\mathbf{u}}$
8 return $q_0$

We can also directly apply this algorithm on ECTMCs
without constructing the uncountably large induced CTMDPs.
The crucial part here is the optimisation over the uncountable
actions, which can be done using a slight adaptation of methods
from [26, Chapter 4.1]. There, optimising the assignment of
successor rates with restrictions given by lower and upper
bounds is already described. Because of this, for each $s \in S$
we can apply the method described there for each $\alpha$ such
that $\mathbf{I}^l(s, \alpha)$ (and thus $\mathbf{I}^u(s, \alpha)$) is defined, to find the optimal
$v_\alpha: S \to [0, \mathbf{u}]$ for this $\alpha$. Afterwards, we choose the optimal
$v_\alpha: S \to [0, \mathbf{u}]$ among all $\alpha$, which is easy as there are only
finitely many.
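One common way to carry out such an optimisation over interval-constrained rates is a greedy assignment: start from the lower bounds and hand out the remaining rate to the successors with the highest current value first. The Python sketch below illustrates this idea under the assumption that this is the kind of method meant; it is not taken from the cited work, and the names `optimal_rates`, `low`, `up` and `q` are hypothetical.

```python
def optimal_rates(low, up, q, u):
    """Choose v with low[s] <= v[s] <= up[s] and sum(v) == u that maximises
    sum(v[s] * q[s]), where q[s] is the current value of successor s."""
    if sum(low) > u + 1e-12 or sum(up) < u - 1e-12:
        raise ValueError("intervals admit no assignment summing to u")
    v = list(low)              # every successor gets at least its lower bound
    remaining = u - sum(low)   # rate that can still be distributed freely
    # Prefer successors with high value; each takes as much as its bound allows.
    for s in sorted(range(len(q)), key=lambda s: q[s], reverse=True):
        extra = min(up[s] - low[s], remaining)
        v[s] += extra
        remaining -= extra
    return v


# Hypothetical intervals for three successors and uniformisation rate 4.
low = [0.0, 1.0, 0.0]
up = [2.0, 3.0, 2.0]
q = [0.2, 0.5, 0.9]
print(optimal_rates(low, up, q, u=4.0))  # [0.0, 2.0, 2.0]
```

For the minimal value the same scheme applies with the successors sorted in ascending order of their values. Repeating the computation for each of the finitely many actions and keeping the best result yields the optimum for the state.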

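Proposition 1: Let $\mathcal{C} = (S, Act, \mathbf{R})$ be a CTMDP with
reward structure $r = (r_c, r_f)$. Then there exists $\sigma \in \Sigma_{CD}$
such that $V^{\max}(\mathcal{C}, s_0, r, \mathbf{t}) = V(X^{\mathcal{C},\sigma,s_0}, r, \mathbf{t})$ for all $s_0 \in S$.
Further, the return value $q_0$ of Algorithm 1 is such that
$|V^{\max}(\mathcal{C}, s_0, r, \mathbf{t}) - q_0(s_0)| < \varepsilon$.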

Proof sketch: At first, we show that we can simulate
each history-dependent randomised scheduler by a randomised
counting scheduler (CR). In contrast to CDs, these schedulers
may be randomised, but, like CDs, only know the number of
steps which have passed rather than the full history. For this,
we use a result about discrete-time Markov chains. Next, we
show that Algorithm 1 cannot yield values which are larger
than the maximal value resulting from such a CR. Then, we
show that the algorithm does not return values which are larger
than the value obtained by any CR plus the specified precision.

Looking at the decisions the algorithm takes at Line 6, we
can reconstruct a prefix of the decisions of a CD. By letting
the precision approach 0, we can show that there is indeed a
complete CD yielding the same value. □
The full proof can be found in Appendix A.
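To make the backward iteration concrete, the following Python sketch implements a value iteration of this kind for a small explicit CTMDP with finitely many actions. It is an illustration under stated assumptions rather than the authors' implementation: the data layout (a list of dicts mapping action names to rate rows), the function names, and the stopping rule (truncation errors of the cumulative and final part each below half the precision) are choices made here, and the direct Poisson formulas are only suitable for moderate values of the product of uniformisation rate and time bound.

```python
import math


def phi(lam, i):
    """Poisson probability of exactly i events."""
    return math.exp(-lam) * lam ** i / math.factorial(i)


def psi(lam, i):
    """Poisson tail probability of more than i events."""
    return 1.0 - sum(phi(lam, j) for j in range(i + 1))


def max_values(R, r_c, r_f, u, t, eps):
    """Approximate maximal values of a uniform CTMDP up to eps.

    R[s] maps each enabled action of state s to its row of rates, which
    sums to u. Returns the list q_0 of values, one per state.
    """
    n, lam = len(R), u * t
    rc_max, rf_max = max(r_c), max(r_f)
    # Choose the truncation depth k.
    k, acc = 0, psi(lam, 0)
    while (lam - acc) * rc_max / u >= eps / 2 or psi(lam, k) * rf_max >= eps / 2:
        k += 1
        acc += psi(lam, k)
    q = [0.0] * n  # q_{k+1}
    for i in range(k, -1, -1):
        q_next = q
        q = []
        for s in range(n):
            # Maximise over enabled actions in the embedded DTMDP (P = R / u).
            m = max(sum(rate / u * q_next[s2] for s2, rate in enumerate(row))
                    for row in R[s].values())
            q.append(m + phi(lam, i) * r_f[s] + psi(lam, i) * r_c[s] / u)
    return q


# Hypothetical example: from state 0, action "fast" moves to the absorbing
# target state 1 with rate 1, action "slow" only with rate 0.5.
R = [{"slow": [0.5, 0.5], "fast": [0.0, 1.0]},
     {"stay": [0.0, 1.0]}]
q0 = max_values(R, r_c=[0.0, 0.0], r_f=[0.0, 1.0], u=1.0, t=1.0, eps=1e-6)
print(q0[0], 1.0 - math.exp(-1.0))  # both close to 0.632
```

In the example, always choosing "fast" is optimal, so the maximal probability of being in the target at time 1 equals that of an exponential delay with rate 1 having elapsed, which gives an independent check of the result.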




Fig. 1. Example to show that we do not compute exact 
extremal interval-bounded until probabilities of CTMDPs. 



Algorithm 1 is also related to an earlier work in queueing theory [59],
which however differs in a number of ways. The target
there was to obtain approximations for a more general class
of schedulers of CTMDPs than we need here, and thus that work does
not consider maxima over HRs explicitly. It assumes a fixed
maximal number of steps in the uniformised DTMDP,
rather than deriving the necessary number, as we do in our
algorithm. [59] is also more concerned with models featuring a
particular structure than with computing conservative bounds on
properties of CTMCs.

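The next proposition states how CTMDPs can be used to
overapproximate CTMCs.

Proposition 2: Let $\mathcal{C} = (S, R)$ be a CTMC with reward
structure $(r_c, r_f)$ and let $\mathfrak{P} = \langle \mathfrak{z}_0, \ldots, \mathfrak{z}_{n-1} \rangle$ be a partitioning
of $S$. Consider the CTMDP $\mathcal{C}' = (\mathfrak{P}, Act, R')$ where for each
$\mathfrak{z} \in \mathfrak{P}$ and $s \in \mathfrak{z}$ we find $a_s \in Act$ such that for all $\mathfrak{z}' \in \mathfrak{P}$ we
have $R'(\mathfrak{z}, a_s, \mathfrak{z}') = \sum_{s' \in \mathfrak{z}'} R(s, s')$. Further, consider a reward
structure $(r_c', r_f')$ such that for all $\mathfrak{z} \in \mathfrak{P}$ it is $r_c'(\mathfrak{z}) \ge \max_{s \in \mathfrak{z}} r_c(s)$
and $r_f'(\mathfrak{z}) \ge \max_{s \in \mathfrak{z}} r_f(s)$. Then for all $\mathfrak{z}_0 \in \mathfrak{P}$ and $s_0 \in \mathfrak{z}_0$ we
have $V(\mathcal{C}, s_0, r, t) \le V^{\max}(\mathcal{C}', \mathfrak{z}_0, r', t)$.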

Proof sketch: We can construct a scheduler $\sigma$ such that
the embedded DTMDP of $\mathcal{C}'$ mimics the behaviour of the
embedded DTMC of $\mathcal{C}$, so that in each step the probability to
be in a given abstract state $\mathfrak{z}$ is the sum of the probabilities of
being in a state $s$ of $\mathcal{C}$ with $s \in \mathfrak{z}$. By the definition of reward
structures, the value obtained in $\mathcal{C}'$ using $\sigma$ is at least as high
as the value in $\mathcal{C}$. As the maximal value in $\mathcal{C}'$ is at least as
high as the one using $\sigma$, the result follows. □
The full proof can be found in Appendix B.

As mentioned before, CSL bounded-until properties can be
expressed using rewards. Their probabilities can thus also be
bounded using Algorithm 1. Notice, however, that for the case
of intervals $[a, b]$ or $[a, \infty)$, the successive application of the
algorithm only bounds the value in the CTMC that has been
abstracted. It does not guarantee to yield the extremal until
probabilities of the CTMDP over HRs in these cases; we only
compute a bound for these values.

Algorithm 1 generalises an approach from a previous
paper about time-bounded reachability [39] using results by
Kwiatkowska et al. [53]. Its correctness also proves that
deterministic counting schedulers suffice to obtain optimal
values, because the algorithm implicitly computes such a
scheduler.

Example 1: Consider the CTMDP $\mathcal{C}$ in Figure 1, which
is adapted from previous publications [60], [54]. The only
difference is that previously non-uniform CTMDPs were used,
that is, CTMDPs in which the sum of leaving rates may be
different for each state. Here, we have increased the self-loop
rates to obtain a uniformisation rate of $u = 10$. States
of the model are given as circles, in which the state name is
written. If there is more than one activated action in a state,
we draw all of them as small black circles connected to the
corresponding states. Non-zero transition rates are then drawn
starting in the corresponding black circle. In case there is just
one possible action in a state, we directly draw the transitions
from this state. Rates with value 0 are left out.

We assign a cumulative reward rate of 0 to each state. The
final reward is 0 for all states except for $s_3$, where it is 1.
Intuitively, if we want to maximise the value, the optimal
choice of the action depends on the amount of time left. If
much time is left, it is best to choose $\beta$ in $s_0$, because this
action leads to a sequence of states which, given an infinite
amount of time, always reaches $s_3$. If little time is left, it is
better to choose $\alpha$, because then $s_3$ can be reached quickly,
although there is a significant chance that this state will not
be reached at all.

With this model and reward structure, we can compute
the probability of the CSL formula $\mathcal{P}(\mathit{true}\ \mathcal{U}^{[0,4]}\ s_3)$ by
computing the value for $t = 4$. Because the state $s_3$ is absorbing,
this value is also the probability of the interval-bounded until
property $\mathcal{P}(\mathit{true}\ \mathcal{U}^{[1,4]}\ s_3)$.

In this special case, we can thus compute this probability
using the method developed in this paper or by the previous
algorithm by Baier et al. [39]. For $s_0$, this value is
$V^{\max}(\mathcal{C}, s_0, (0, r_f), 4) \approx 0.659593$.

In case this CTMDP is the abstraction of a given CTMC,
we can apply two consecutive analyses to bound the interval-bounded
until probability. For this, we first compute $v(\cdot) =
V^{\max}(\mathcal{C}, \cdot, (0, r_f), 3)$, and afterwards we consider $v'(\cdot) =
V^{\max}(\mathcal{C}, \cdot, (0, v), 1)$. We now have $v'(s_0) \approx 0.671162$, which
shows that this value is an upper bound for the until probability
in the original model. It is indeed between the value obtained
using HRs as discussed previously and the one obtained using
time-dependent schedulers [60], [54]. This value is larger than
the one considered in the last paragraph, which shows that a
consecutive analysis does not yield the maximum interval-bounded
reachability probability over all HRs in a given
CTMDP. The reason is that, by dividing the
analysis into two parts, the schedulers of the two analyses can
change their decisions more often and obtain more information
than they are supposed to have. Instead of only having the
information that the objective is to optimise the reward until
$t = 4$, it is now also known whether $t < 1$ or not.

From the discussion of the values of consecutive time points
$t_1 = \delta_1, t_2 = t_1 + \delta_2, \ldots$, it follows that we can also use the
algorithm to compute bounds for them in an efficient way.
Instead of doing analyses with time bounds $t_1, t_2, \ldots$, we only
have to do analyses with values $\delta_1, \delta_2, \ldots$, which might be
much lower than $t_1, t_2, \ldots$ Thus, the fact that the algorithm
can handle final and cumulative rewards at the same time
has the potential to speed up such a series of analyses.



3.2 Abstracting Prism Models 



To take advantage of Proposition 2, we want to avoid actually
constructing the CTMC to be abstracted. Doing this allows us
to handle models which are too large to be handled in an
explicit-state form. For this, we can use non-probabilistic model
checkers which feature a guarded-command language, like
NuSMV [61]. Such a tool can work with an OBDD-based
representation of PMs as in Definition 18 and compute the
set of reachable states. We can then specify some OBDDs
representing predicates, i.e., sets of concrete states. These can
be used to split the state space, by subsuming all concrete
states that are contained in the same subset of predicates,
to obtain an OBDD partitioning as in Definition 19.

Next, we consider the ECTMC abstraction of a given Prism 
model. 

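Definition 20: Consider a PM $m = (\mathit{Var}, \mathit{init}, C, \mathit{succ}, R_c, R_f)$
with induced CTMC $\mathcal{C} = (S, R)$ and a partitioning $\mathfrak{P} =
\langle \mathfrak{z}_0, \ldots, \mathfrak{z}_{n-1} \rangle$ of its state space. The ECTMC abstraction of $m$
is defined as $\hat{\mathcal{C}} = (\mathfrak{P}, Act, I^l, I^u)$ with $Act = \{a \colon C \rightharpoonup \mathfrak{P}\}$. We
let $A(\mathfrak{z}, a)$ denote the set of all $s \in \mathfrak{z}$ such that $\mathit{Dom}(\mathit{succ}(s, \cdot)) =
\mathit{Dom}(a)$ (that is, the domains of the two partial functions agree)
and for all applicable $c \in C$ we have $\mathit{succ}(s, c) \in a(c)$. We then
choose the domain of $I^l$ and $I^u$ such that for all $\mathfrak{z} \in \mathfrak{P}$ it is

$\mathit{Dom}(I^l(\mathfrak{z}, \cdot)) = \mathit{Dom}(I^u(\mathfrak{z}, \cdot)) = \{a \in Act \mid A(\mathfrak{z}, a) \neq \emptyset\}$.

Then for $\mathfrak{z}, \mathfrak{z}' \in \mathfrak{P}$ and $a \in \mathit{Dom}(I^u(\mathfrak{z}, \cdot))$ we define

$I^l(\mathfrak{z}, a)(\mathfrak{z}') = \min_{s \in A(\mathfrak{z}, a)} \sum_{s' \in \mathfrak{z}'} R(s, s')$

and accordingly $I^u$ using max. The abstract reward structure
$r = (r_c, r_f)$ is defined as $r_c(\mathfrak{z}) = \max_{s \in \mathfrak{z}} R_c(s)$ and $r_f(\mathfrak{z}) =
\max_{s \in \mathfrak{z}} R_f(s)$.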

By construction, the CTMDP semantics of the ECTMC fulfils
the requirements of Proposition 2 for a correct abstraction of
the CTMC semantics of the Prism model. It is also monotone
in the sense that, by using a refined partitioning, we cannot
obtain worse bounds than with a coarser partitioning.
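To make the construction of the rate intervals concrete, the following is a minimal explicit-state sketch under simplifying assumptions: the CTMC is given as nested dictionaries of rates, every state that occurs as a successor is mapped to a block by `block_of`, and the abstract action of a state is supplied by a caller-provided function `action_of`. All of these names are hypothetical. The sketch lumps the outgoing rates of each concrete state by target block and then takes the minimum and maximum over all states that share the same block and action.

```python
def abstract_intervals(rates, block_of, action_of):
    """Compute lower/upper rate bounds of an interval abstraction.

    rates:     dict state -> dict successor -> rate (explicit CTMC, assumed)
    block_of:  dict state -> block identifier (the partitioning)
    action_of: callable state -> hashable abstract action (assumed given)
    Returns (low, up), dicts keyed by (block, action) -> dict block' -> rate.
    """
    blocks = set(block_of.values())
    low, up = {}, {}
    for s, succ in rates.items():
        # Sum the rates of s into each abstract successor block.
        lumped = dict.fromkeys(blocks, 0.0)
        for s2, rate in succ.items():
            lumped[block_of[s2]] += rate
        key = (block_of[s], action_of(s))
        if key not in low:
            low[key] = dict(lumped)
            up[key] = dict(lumped)
        else:
            # Widen the intervals so that they cover the rates of s.
            for b in blocks:
                low[key][b] = min(low[key][b], lumped[b])
                up[key][b] = max(up[key][b], lumped[b])
    return low, up
```

Abstract rewards can be handled in the same loop by keeping the maximum of the concrete reward rates per block.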

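Proposition 3: Consider a PM $m = (\mathit{Var}, \mathit{init}, C, \mathit{succ},
R_c, R_f)$ with a partitioning $\mathfrak{P} = \langle \mathfrak{z}_0, \ldots, \mathfrak{z}_{n-1} \rangle$ of the state
space of its induced CTMC, and a further partitioning $\mathfrak{P}' =
\langle \mathfrak{z}'_0, \ldots, \mathfrak{z}'_{m-1} \rangle$ such that for each $\mathfrak{z}_i \in \mathfrak{P}$ we find $\mathfrak{z}'_{i_1}, \ldots, \mathfrak{z}'_{i_k} \in \mathfrak{P}'$
such that $\mathfrak{z}_i = \bigcup_{j=1}^{k} \mathfrak{z}'_{i_j}$.

Then for two ECTMC abstractions $\hat{\mathcal{C}} = (\mathfrak{P}, Act, I^l, I^u)$ and
$\hat{\mathcal{C}}' = (\mathfrak{P}', Act', I^{l\prime}, I^{u\prime})$ with corresponding reward structures $r$
and $r'$ we have $V^{\max}(\hat{\mathcal{C}}, \mathfrak{z}_i, r, t) \ge V^{\max}(\hat{\mathcal{C}}', \mathfrak{z}'_{i_j}, r', t)$.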

Proof sketch: We can show that for an arbitrary $\varepsilon > 0$,
we have that $V^{\max}$ of a state $\mathfrak{z}$ of the original partition plus
$\varepsilon$ is at least equal to the value of a state $\mathfrak{z}'$ of the refined
partition for which we have $\mathfrak{z}' \subseteq \mathfrak{z}$. This implies that the same
holds for $\varepsilon = 0$, which means that the value of a state of the
refined partition cannot be higher than the one of the original
abstraction.

We use the fact that we can apply Algorithm 1 to compute
values up to any precision $\varepsilon > 0$. Consider the runs of the
algorithm on the original and refined partitioning. Before the
execution of the main loop at Line 4 we have that $q_{k+1}(\mathfrak{z}) =
q_{k+1}(\mathfrak{z}') = 0$. For each execution of the main loop, we have
a maximising decision in $\mathfrak{z}'$, leading to $v_i(\mathfrak{z}')$ to be added to
$q_{i+1}(\mathfrak{z}')$ to obtain $q_i(\mathfrak{z}')$. We can construct a decision for $\mathfrak{z}$ such
that $v_i(\mathfrak{z}') \le v_i(\mathfrak{z})$. This means that in each iteration of the main
loop, the $q_i$ of $\mathfrak{z}'$ can never become larger than the one of $\mathfrak{z}$.
Thus, after the termination of the main loop and the algorithm,
the value obtained for the coarser abstraction is at least as high
as the one in the refined abstraction. □
The full proof can be found in Appendix C.



With a given partitioning, we can apply Algorithm 2 to
obtain an abstraction of the model. The algorithm computes
an ECTMC to provide an upper bound of the model value; a
corresponding algorithm for the lower bound can be defined
likewise.

The algorithm does some initialisations and afterwards, in
line 5, calls Algorithm 3. This algorithm descends into the
OBDD partitioning (lines 3 to 14), visiting each state of the
model explicitly. When a specific state $s$ contained in an abstract
state $\mathfrak{z}$ is reached (lines 16 to 29), we extend the abstracting
model to take into account the behaviour of this state. In lines
17 and 18 we extend the upper bounds for the reward rates of
$\mathfrak{z}$ such that they are at least as high as those of $s$. Notice that
to compute the reward rates of this state we use the original
high-level PM, not the OBDD representation. Then, in lines
20 to 29 we handle the transition rates, to include the
rates of the concrete state. We again use the high-level model,
this time to compute the set of commands that are enabled in
the current state (line 20), the corresponding action $a$ (line 21,
cf. Definition 20), and the concrete successor states with their
corresponding rates (line 22). For each of them we use the
function sAbs to obtain the abstract state it belongs to. For
all successor states we add up the rates to the same abstract
state (line 26). Then, starting from line 27, we apply the actual
widening of the rates.

Function sAbs works as follows: Since we use a variable
order in which the variables encoding states are placed above
the variables for the abstract states, each state $s \in S$ induces
a path in the OBDD representing $\mathfrak{P}$ which ends at the OBDD node that
represents the abstract state of $s$. We follow the unique path
given by the encoding of $s$ down to the terminal node labelled with 1.
This yields the encoding of the abstract state of $s$. The runtime
of sAbs is therefore linear in the number of OBDD variables.
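The path-following idea behind sAbs can be sketched on a toy OBDD data structure. This is only an illustration under stated assumptions: the `Node` class, the Boolean terminals, and the function `s_abs` are hypothetical and do not reflect the data structures of NuSMV or of the authors' implementation. The sketch also assumes that no block variable is skipped along the path, which holds when every concrete state belongs to exactly one block.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Node:
    var: str            # variable tested at this node
    low: "Bdd"          # child for var = 0
    high: "Bdd"         # child for var = 1


Bdd = Union[Node, bool]  # terminals are the Python booleans


def s_abs(root, order, state_bits, block_vars):
    """Return the block encoding of a concrete state.

    order:      list of all variables, state variables first (assumed)
    state_bits: dict state variable -> 0/1 (encoding of the state)
    block_vars: set of variables that encode the block number
    """
    node = root
    block_bits = {}
    for var in order:
        if var in block_vars:
            if isinstance(node, bool) or node.var != var:
                raise ValueError("block variable skipped: block not unique")
            # Exactly one child leads to the 1-terminal; take that one.
            bit = 0 if node.low is not False else 1
            block_bits[var] = bit
            node = node.high if bit else node.low
        elif not isinstance(node, bool) and node.var == var:
            node = node.high if state_bits[var] else node.low
        # otherwise the state variable is skipped in the reduced OBDD
    if node is not True:
        raise ValueError("state is not covered by the partitioning")
    return block_bits
```

Each variable is inspected at most once, which matches the linear runtime stated above.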

Let $n = |S|$ be the number of concrete states of the model
under consideration, let $k$ be the total number of positive
transitions and let $c$ be the number of OBDD variables.
Algorithm 3 visits each state and transition once. Since the
block number variables are placed at the bottom of the variable
order, accessing the block number of a state in function sAbs
has a runtime of $O(c)$. From this, we have that the overall
complexity of Algorithm 2 is $O((n + k) \cdot c)$.

Algorithm 2: Compute ECTMC and reward structure from
a given partitioning $\mathfrak{P}$ with OBDD $b_{\mathfrak{P}} = (N, n_{\mathfrak{P}}, h, l, v)$ of
PM $m = (\mathit{Var}, \mathit{init}, C, \mathit{succ}, R_c, R_f)$.

1 $I^l(\cdot, \cdot)(\cdot) :=$ undefined
2 $I^u(\cdot, \cdot)(\cdot) :=$ undefined
3 $r_c(\cdot) := r_f(\cdot) := -\infty$
4 $\mathfrak{A} := \{0, 1, \ldots, |\mathfrak{P}| - 1\}$
5 approx($n_{\mathfrak{P}}$, 0)
6 return $((\mathfrak{A}, I^l, I^u), (r_c, r_f))$

Algorithm 3: Procedure approx($n$, level).

1 if $n = \mathit{bdd}_0$ then return
2 else if level < leafLevel then
3     // We are still at a variable level.
4     $x :=$ varAtLevel(level)
5     if $n \neq \mathit{bdd}_1$ and $x = v(n)$ then
6         $n_l := l(n)$, $n_h := h(n)$
7     else
8         $n_l := n$, $n_h := n$
9     if $x \in \mathfrak{B}$ then
10        $\mathfrak{z}(x) := 0$, approx($n_l$, level + 1)
11        $\mathfrak{z}(x) := 1$, approx($n_h$, level + 1)
12    else
13        $s(x) := 0$, approx($n_l$, level + 1)
14        $s(x) := 1$, approx($n_h$, level + 1)
15 else
16    // We have traversed all variable levels.
17    $r_c(\mathfrak{z}) := \max(r_c(\mathfrak{z}), R_c(s))$
18    $r_f(\mathfrak{z}) := \max(r_f(\mathfrak{z}), R_f(s))$
19
20    $C := \mathit{Dom}(\mathit{succ}(s, \cdot))$ // commands enabled in $s$
21    $a := \{(c, \mathrm{sAbs}(n_{\mathfrak{P}}, s')) \mid c \in C \wedge s' = \mathit{succ}(s, c)\}$
22    $\Lambda := \mathit{succ}(s)$
23    $\hat{\Lambda}(\cdot) := 0$
24    forall $(s', \lambda) \in \Lambda$ do
25        $\mathfrak{z}' := \mathrm{sAbs}(n_{\mathfrak{P}}, s')$
26        $\hat{\Lambda}(\mathfrak{z}') := \hat{\Lambda}(\mathfrak{z}') + \lambda$
27    forall $\mathfrak{z}' \in \mathfrak{A}$ do
28        $I^l(\mathfrak{z}, a)(\mathfrak{z}') := \min(I^l(\mathfrak{z}, a)(\mathfrak{z}'), \hat{\Lambda}(\mathfrak{z}'))$
29        $I^u(\mathfrak{z}, a)(\mathfrak{z}') := \max(I^u(\mathfrak{z}, a)(\mathfrak{z}'), \hat{\Lambda}(\mathfrak{z}'))$

In the discussion so far, we assumed that it is already clear
how the set of concrete states shall be divided into abstract
states. We might however come across models where this is
not clear, or where the results obtained from the abstraction are
unsatisfactory. In these cases, we have to apply refinement, that
is, split existing abstract states into new ones. For other model
types, such refinement procedures already exist [31], [32]. In
the analysis types considered there, schedulers sufficed which
fix a decision per state, and take neither the past history nor the
number of steps before the state was entered into account. Then,
depending on the decisions of the scheduler per state, new
predicates are introduced to split the state space. In our case,
such simple schedulers are not sufficient to obtain extremal
values, as has already been shown for the simpler case of
time-bounded reachability [39]. Thus, it is not clear how to
introduce predicates to split the state space.

As a first heuristic, we do the following: We treat an OBDD
representation of a PM as a labelled transition system, in which
the commands play the role of the labels. We then use an
existing algorithm to symbolically compute (non-probabilistic)
strong bisimulations [62], but stop the algorithm after a number
of steps. This way, we obtain a partitioning in the form of
Definition 19. As we will see later in Section 4, although the
method is not guaranteed to yield a good abstraction, it can
work well in practice.
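The effect of stopping the bisimulation computation early can be illustrated with an explicit-state sketch. Note that the paper uses a symbolic, OBDD-based algorithm [62]; the signature-based refinement below is a swapped-in, simplified technique for labelled transition systems, and the function name `refine` and its data encoding are assumptions made for illustration only.

```python
def refine(states, transitions, iterations):
    """Signature-based partition refinement, stopped after a fixed
    number of rounds.

    states:      list of states
    transitions: set of (source, label, target) triples
    iterations:  maximal number of refinement rounds
    Returns a dict state -> block identifier.
    """
    block = {s: 0 for s in states}        # start with a single block
    for _ in range(iterations):
        # The signature of s: its block and the labelled blocks it reaches.
        signature = {
            s: (block[s],
                frozenset((l, block[t]) for (u, l, t) in transitions if u == s))
            for s in states
        }
        ids = {}
        new_block = {s: ids.setdefault(signature[s], len(ids)) for s in states}
        if len(ids) == len(set(block.values())):
            break                         # fixed point: strong bisimulation
        block = new_block
    return block
```

With few iterations the partitioning is coarse; each additional round can only split blocks further, until the fixed point is reached.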

The method discussed works for a very general class of PMs 
and arbitrary state partitionings. However, because it is based 
on explicitly visiting each concrete state at least once, it may 
take much time to perform for large models. To tackle this 
problem, a parallel implementation of the technique is possible. 
Given a computing system with a number of processors, one 
can symbolically divide the states of the model, such that each 
processor works on a different part of the OBDD representing 
the state space. Each processor can then process the model part 
it is assigned to. The only point of interaction is the widening 
of the rates and reward rates of the abstract model. On a shared 
memory architecture, one could use different semaphores for 
the reward rates and successor transitions of each abstract state 
to avoid delays. Without shared memory, the processors can 
compute partial abstractions independently, which are merged 
after the computations are finished. This technique is faster, 
but has the disadvantage of having to store several (partial) 
copies of the abstraction. If the state space is divided such that 
all states of an abstract state are assigned to a single processor, 
no locking is needed and the overhead is reduced. 
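For the variant without shared memory, merging the partial abstractions amounts to a pointwise minimum of the lower bounds and a pointwise maximum of the upper bounds. The following sketch assumes that each worker returns two dictionaries keyed by (abstract state, action, abstract successor) and that a worker stores an entry for every successor block of each (abstract state, action) pair it has seen; the function name `merge_abstractions` and this encoding are hypothetical.

```python
def merge_abstractions(parts):
    """Merge partial interval abstractions computed independently.

    parts: iterable of (low, up) pairs, each a dict keyed by
           (block, action, successor block) -> rate bound (assumed encoding)
    Returns the merged (low, up) pair.
    """
    low, up = {}, {}
    for part_low, part_up in parts:
        for key, rate in part_low.items():
            low[key] = min(low.get(key, rate), rate)   # widen downwards
        for key, rate in part_up.items():
            up[key] = max(up.get(key, rate), rate)     # widen upwards
    return low, up
```

Upper bounds on reward rates can be merged in the same way using the maximum.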

As an alternative to parallelisation, it should also be possible 
to use optimisation methods over BDDs 163], iMl, iSl, CHI 
to compute the rate and reward intervals symbolically rather 
than rely on explicit enumeration of all possible variable 
assignments. 

4 Case Studies 

To show the practicality of the method, we applied it to two
case studies from classical performance and dependability
engineering [22], [23]. We implemented the techniques of
Algorithm 1 and Algorithm 2. To represent the ECTMCs, we
used a sparse-matrix-like data structure.

Where possible, we compared the results to Prism. Prism
always starts by building an MTBDD representation of the
model under consideration. The subsequent analysis is then
performed using value iteration in the CTMC semantics
similarly to Algorithm 1. The data structure used here is either
an MTBDD, a sparse matrix, or a hybrid structure. In
the latter, values for the model states are stored explicitly, but
parts of the transition structure are stored implicitly.

For all experiments we used a Quad-Core AMD Opteron™ 
Processor 8356 (of which we only used one core) with 
2300 MHz and 64 GB of main memory. 

Workstation cluster [22]: We consider a fault-tolerant
workstation cluster. It consists of two sub-clusters,
which, in turn, contain workstations connected via a central
switch. The two switches are connected via a backbone. Each
component of the system can break down, and is then fixed
by a single repair unit responsible for the entire system.

We are interested in the expected number of repairs until a
time bound of $t = 500$, a property which can be expressed using
cumulative reward rates. For $N$ up to 512, the model has been
successfully analysed before using Prism¹. While the existing
analysis methods worked well for model instantiations up to
this and somewhat above, the techniques do not work well
anymore for a very large number of workstations. Constructing
the model using MTBDDs seems not to be problematic, but
the subsequent analyses cannot be performed successfully. The
sparse-matrix and the hybrid method fail at some point, because
they rely on an explicit representation of the state space and thus
run out of memory. The MTBDD-based value iteration also fails
at some point, and works rather slowly. The reason is probably
that during the value iteration a large number of different
non-terminal nodes appear, which make the MTBDD complex and
thus large and slow to operate on. Detailed information about
the performance of Prism on this case study is given in Table 1.
By $|S|$ and $|R|$ we give the approximate number of states and
transitions, respectively, of the original CTMC model. For each of
the three Prism engines we give the runtime (columns "Time")
and memory consumption (columns "Memory") for computing
the expected rewards. An entry "- Time out -" means that
Prism did not terminate within 160,000 seconds, an entry
"- Memory out -" that more than 60 GB of memory are required
to complete the analysis.



In Table 2 we apply the method developed in this paper to
several instantiations of the number of workstations $N$. The
results we obtained by our method are given in "ECTMC
Results". Besides the runtime and memory consumption, we
give in the column titled "$|\mathfrak{P}|$" the number of abstract states
we used for the corresponding analysis. The column "Interval"
gives the lower and upper bounds of the actual value of the
expected reward.

As we see from the time and memory usage, for smaller
models it is advantageous to use an explicit-state method as
implemented in Prism, because of the additional overhead our
method introduces. As instantiations become larger, using the
method of this paper becomes worthwhile. While we do not
always get precise bounds for all analyses performed with this
number of abstract states, we were always able to compute the
order of magnitude. Interestingly, the value bounds get tighter
with an increasing number of model states.

As discussed in Section 3, we apply a heuristic refinement
algorithm based on bisimulations for labelled transition systems.
We use the symbolic algorithm [62] for computing
(non-stochastic) strong bisimulations to obtain a suitable state
partitioning. We abort its fixed-point iteration prematurely after a
user-specified number $n$ of iterations. In Figure 2 we show how
the quality of the approximation evolves depending on $n$. The
behaviour of the cluster case study is shown on the left. One
can observe that the computed interval converges
quickly to the actual value when increasing the number of
iterations.

Notably, if we use the same number of refinement steps,
for all model instantiations considered, $|\mathfrak{P}|$ stayed constant,
although the number of model states $|S|$ was different for each
instantiation (cf. Table 2).



1. http://www.prismmodelchecker.org/casestudies/cluster.php#mc, Property
R{"num_repairs"}=?[ C<=T ].



TABLE 1
Prism results for the number of repairs in the workstation cluster until $t = 500$

| $N$ | $\lvert S \rvert$ | $\lvert R \rvert$ | Sparse engine (Time / Memory) | Hybrid engine (Time / Memory) | Symbolic engine (Time / Memory) | Result |
|---|---|---|---|---|---|---|
| 32 | $3.87 \cdot 10^{4}$ | $1.86 \cdot 10^{5}$ | 9.29 / 36.67 | 14.90 / 37.48 | 13791.20 / 184.49 | 64.17635 |
| 64 | $1.51 \cdot 10^{5}$ | $7.33 \cdot 10^{5}$ | 62.11 / 42.92 | 88.93 / 41.69 | - Time out - | 127.98101 |
| 128 | $5.97 \cdot 10^{5}$ | $2.91 \cdot 10^{6}$ | 380.51 / 60.18 | 585.55 / 54.48 | - Time out - | 255.48297 |
| 256 | $2.37 \cdot 10^{6}$ | $1.16 \cdot 10^{7}$ | 3182.73 / 141.71 | 4737.73 / 98.27 | - Time out - | 509.58417 |
| 512 | $9.47 \cdot 10^{6}$ | $4.62 \cdot 10^{7}$ | 10540.54 / 817.39 | 14965.74 / 284.66 | - Time out - | 896.80612 |
| 1024 | $3.78 \cdot 10^{7}$ | $1.85 \cdot 10^{8}$ | 13242.08 / 3154.91 | 25513.31 / 1014.79 | - Time out - | 905.19921 |
| 2048 | $1.51 \cdot 10^{8}$ | $7.39 \cdot 10^{8}$ | - Time out - | - Time out - | - Time out - | ?? |
| 4096 | $6.04 \cdot 10^{8}$ | $2.95 \cdot 10^{9}$ | - Memory out - | - Time out - | - Time out - | ?? |
| 8192 | $2.42 \cdot 10^{9}$ | $1.18 \cdot 10^{10}$ | - Memory out - | - Memory out - | - Time out - | ?? |
| 16384 | $9.66 \cdot 10^{9}$ | $4.72 \cdot 10^{10}$ | - Memory out - | - Memory out - | - Time out - | ?? |

Fig. 2. Quality of the ECTMC approximation for different numbers of bisimulation iterations

TABLE 2
Number of repairs in the workstation cluster until time $t = 500$.

| $N$ | $\lvert \mathfrak{P} \rvert$ | Time | Memory | Interval |
|---|---|---|---|---|
| 32 | 19420 | 107.15 | 90.03 | [64.176, 64.199] |
| 64 | 19420 | 109.43 | 86.28 | [127.980, 128.490] |
| 128 | 19420 | 115.93 | 89.54 | [255.455, 259.797] |
| 256 | 19420 | 132.89 | 94.58 | [509.000, 580.485] |
| 512 | 19420 | 181.99 | 91.87 | [869.749, 900.052] |
| 1024 | 19420 | 412.43 | 107.22 | [905.018, 905.200] |
| 2048 | 19420 | 1335.54 | 103.31 | [905.766, 905.767] |
| 4096 | 19420 | 5298.29 | 104.89 | [905.955, 905.955] |
| 8192 | 19420 | 28361.36 | 132.48 | [906.040, 906.040] |
| 16384 | 19420 | 147691.30 | 139.56 | [906.084, 906.084] |

Table 3 contains more detailed running times for the cluster
benchmark with $N = 2048$ workstations using the ECTMC
abstraction for different numbers of bisimulation iterations,
which are given in the first column. The second column contains
the number of abstract states. The running times in seconds are
given separately for the different main operations: the call to
NuSMV to generate the OBDDs for the underlying transition
system (col. 3), the given number of bisimulation iterations
(col. 4), the construction of the ECTMC from the partitioning
(col. 5), the value iteration to compute the reward interval
(col. 6) and finally the total computation time (col. 7). The last
two columns contain the memory consumption in Megabytes
and the computed reward interval.

Google file system [23], [66]: We additionally
consider a replicated file system as used as part of the Google
search engine. Originally, the model was given as a generalised
stochastic Petri net, but was transformed to a PM for the
analysis.

Files are divided into chunks of equal size. Several copies of
each chunk reside at several chunk servers. There is a single
master server which knows the location of the chunk copies.
If a user of the file system wants to access a certain chunk of
a file, it asks the master for the location. Data transfer then
takes place directly between a chunk server and the user. The
model describes the life cycle of a single chunk, but accounts
for the load caused by the other chunks.

The model features three parameters: $M$ is the number of
chunk servers, with $C$ we denote the number of chunks a chunk
server may store, and the total number of chunks is $N$.

We consider the minimal probability over all states in which
severe hardware problems have occurred (master server is
down and more than three quarters of the chunk servers are
down), that within time $t$ a state will be reached in which a
guaranteed quality-of-service level (all three chunk copies are
present and the master server is available) holds. This is a
bounded-reachability property and thus based on final rewards.

We fix $C = 5000$, $N = 100000$ and $t = 60$ and provide
results for several $M$ in Table 5. In the analyses with Prism
(see Table 4), we used an improved OBDD variable order,
such that the performance results are better than in a previous
publication [66]. In contrast to the previous case study, Prism's
sparse matrix engine was faster and did not use more memory
compared to the hybrid engine. The symbolic engine was again
the slowest. The MTBDD representation of the model requires
more memory per concrete state compared to the previous case
study. We assume that this is because the number of different
rates occurring is much higher, and because some of the rates
TABLE 3
Detailed experimental results for the workstation cluster ($N = 2048$, $t = 500$) using the ECTMC abstraction

| Iterations | $\lvert \mathfrak{P} \rvert$ | Running time: NuSMV | Running time: Refinement | Running time: ECTMC | Running time: Value Iter. | Running time: Total | Memory | Interval |
|---|---|---|---|---|---|---|---|---|
| 5 | 2440 | 64.75 | 0.93 | 991.16 | 13.45 | 1070.34 | 56.13 | [902.092, 905.771] |
| 10 | 9216 | 68.46 | 6.96 | 1135.57 | 41.68 | 1252.79 | 70.11 | [905.739, 905.767] |
| 15 | 19420 | 65.25 | 18.26 | 1029.23 | 116.89 | 1229.78 | 103.31 | [905.766, 905.767] |
| 20 | 34596 | 73.93 | 43.71 | 1150.70 | 213.72 | 1482.29 | 134.62 | [905.767, 905.767] |
| 25 | 52600 | 69.82 | 78.88 | 1070.13 | 321.43 | 1540.49 | 180.85 | [905.767, 905.767] |
| 30 | 76176 | 66.43 | 126.85 | 1052.87 | 466.31 | 1712.76 | 227.80 | [905.767, 905.767] |




TABLE 4 

Prism results for the Google file system 
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- Time out — 
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10' 


- Time out - 


- Time out - 


- Time out — 


?? 


2 048 
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10' 


1.78- 


10» 


- Time out - 


- Time out - 


- Time out — 


?? 



TABLE 5
Google file system with property 12 and $t = 60$, $N = 100000$, $C = 5000$.

| $M$ | $\lvert \mathfrak{P} \rvert$ | Time | Mem. | Interval |
|---|---|---|---|---|
| 32 | 6138 | 30.28 | 72.28 | [0.0000, 0.0000] |
| 64 | 18042 | 251.86 | 86.53 | [0.5071, 0.5071] |
| 128 | 29515 | 822.08 | 142.23 | [0.5071, 0.5071] |
| 256 | 29515 | 1531.46 | 147.66 | [0.5071, 0.5071] |
| 512 | 29515 | 2951.37 | 129.72 | [0.5071, 0.5071] |
| 1024 | 29515 | 6776.42 | 128.40 | [0.5071, 0.5071] |
| 2048 | 29515 | 14957.94 | 137.47 | [0.5071, 0.5071] |



are obtained by multiplying state variables, thus leading to a
more complex MTBDD structure. Notice that in this model,
from a certain value of $M$ onward, the probability discussed
is almost the same.

We give detailed information on the instance with $M = 128$
of the Google file system in Table 6 and Figure 2 (right-hand
side) for different numbers $n$ of bisimulation iterations. We
can again observe that the computed interval for the
bounded-reachability property quickly converges to the actual probability
with increasing $n$.

5 Conclusion 

We developed an efficient method to compute extremal values 
of CTMDPs over HR schedulers. It can be used to safely 
bound quantities of interest of CTMCs, by abstracting them 
into a special class of CTMDPs, and then applying this method. 
Experimental results have shown that the approach works well 
in practice. 

There are several directions for future work. The current
refinement technique surely does not yield an optimal
partitioning of the state space in all cases. We thus want to see
how the scheduler we implicitly obtain by Algorithm 1 can
be used to refine the model. The abstraction technique could
also be extended to other property classes and models. For
instance, models already involving nondeterminism could be
abstracted and approximated using Markov games [67], [68],
[69]. It would also be interesting to see how a parallelised or
symbolic abstraction method sketched at the end of Section 3
performs against the one currently implemented. Using a three- 
valued logic f2E\, the technique could also be integrated into 
an existing probabilistic (CSL) model checker. 

Acknowledgements: We thank Martin
Neuhäußer and Lijun Zhang for fruitful discussions.
Appendix A

Proof of Proposition 1

We need some additional notations and lemmata to complete
the proofs of the propositions in the paper.

Definition 21: Let $\mathcal{D} = (S, P)$ be a DTMC. For $s_0, s_k \in S$
and $k \in \mathbb{N}$ we define $\pi^{\mathcal{D}}(s_0, k, s_k) = \Pr(X^{\mathcal{D},s_0}_k = s_k)$.

Definition 21 specifies the transient probability to be in state
$s_k$ at step $k$ when having started in state $s_0$.

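Corollary 1: Notice that for $\mathcal{D} = (S, P)$, $k \in \mathbb{N}$ and $s_0, s_k \in
S$ it is

$\pi^{\mathcal{D}}(s_0, k, s_k) = \sum_{s_1 \in S} P(s_0, s_1) \cdot \sum_{s_2 \in S} P(s_1, s_2) \cdot \sum_{s_3 \in S} P(s_2, s_3) \cdots \sum_{s_{k-1} \in S} P(s_{k-2}, s_{k-1}) \cdot P(s_{k-1}, s_k)$,

that is, the transient probability in a DTMC can be expressed
using matrix multiplications.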
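As a small illustration of this statement, the transient distribution after $k$ steps can be obtained by multiplying the initial distribution $k$ times with the transition matrix. The sketch below uses plain Python lists; the function name `transient` and the list-of-rows matrix encoding are assumptions made for the example.

```python
def transient(P, s0, k):
    """Return the list of probabilities pi(s0, k, s) for all states s.

    P:  transition matrix of the DTMC as a list of rows (assumed encoding)
    s0: index of the initial state
    k:  number of steps
    """
    n = len(P)
    dist = [1.0 if s == s0 else 0.0 for s in range(n)]
    for _ in range(k):
        # One vector-matrix multiplication per step.
        dist = [sum(dist[s] * P[s][s2] for s in range(n)) for s2 in range(n)]
    return dist
```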

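We extend the definition of values to discrete-time models,
which will be used in the further parts of the proof.

Definition 22: Let $\mathcal{D} = (S, P)$ be a DTMC, let $r = (r_c, r_f)$
be a reward structure and let $u, t > 0$. We define

$V(\mathcal{D}, s_0, r, t, u) = \sum_{i=0}^{\infty} \left( \phi_{ut}(i) \sum_{s_i \in S} \pi^{\mathcal{D}}(s_0, i, s_i)\, r_f(s_i) + \psi_{ut}(i) \sum_{s_i \in S} \pi^{\mathcal{D}}(s_0, i, s_i)\, \frac{r_c(s_i)}{u} \right)$.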

We define a type of schedulers which are simpler than the
HRs of Definition 7 and at the same time generalise the CDs
of Definition 9.

Definition 23: A time-abstract, history-abstract, counting,
randomised scheduler (CR) for a DTMDP $\mathcal{D} = (S, Act, P)$ or a
CTMDP $\mathcal{C} = (S, Act, R)$ is a function $\sigma \colon (S \times \mathbb{N}) \to \mathit{Distr}(Act)$
such that for all $s \in S$ and $n \in \mathbb{N}$, if $\sigma(s, n)(a) > 0$ then
$a \in Act(s)$. With $\Sigma_{CR}$ we denote the set of all CRs.

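Definition 24: Assume we are given a CTMDP $\mathcal{C} =
(S, Act, R)$ and a CR $\sigma \colon (S \times \mathbb{N}) \to \mathit{Distr}(Act)$. We define
the induced CTMC as $\mathcal{C}_\sigma = (S', R')$ with

• $S' = S \times \mathbb{N}$,

• $R'((s, n), (s', n + 1)) = \sum_{a \in Act} \sigma(s, n)(a) \cdot R(s, a, s')$ for
$s, s' \in S$ and $n \in \mathbb{N}$, and $R'(\cdot) = 0$ else.

Let $X^{\mathcal{C}_\sigma, s_0} \colon (\Omega_{\mathcal{C}_\sigma} \times \mathbb{R}_{\ge 0}) \to (S \times \mathbb{N})$ be the stochastic process
of the CTMC $\mathcal{C}_\sigma$ and let $f \colon (S \times \mathbb{N}) \to S$ with $f(s, n) = s$.
Induced DTMCs of DTMDPs are defined accordingly. The
induced stochastic process $X^{\mathcal{C},\sigma,s_0} \colon (\Omega_{\mathcal{C}_\sigma} \times \mathbb{R}_{\ge 0}) \to S$ of $\mathcal{C}$ and
$\sigma$ starting in $s_0 \in S$ is then defined as $X^{\mathcal{C},\sigma,s_0}_t = f \circ X^{\mathcal{C}_\sigma,s_0}_t$ for
$t \in \mathbb{R}_{\ge 0}$. Definitions for DTMDPs are likewise using $P$ instead
of $R$.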
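The rate function of the induced CTMC can be written down directly. The following minimal sketch assumes that the CTMDP rates are stored in a dictionary keyed by (state, action, successor) and that the counting scheduler is a Python function returning a dictionary from actions to probabilities; the names `induced_rate`, `R` and `sigma` are hypothetical.

```python
def induced_rate(R, sigma, s, n, s2, n2):
    """Rate from (s, n) to (s2, n2) in the CTMC induced by a
    counting randomised scheduler.

    R:     dict (state, action, successor) -> rate (assumed encoding)
    sigma: callable (state, step) -> dict action -> probability
    """
    if n2 != n + 1:
        return 0.0          # the step counter always increases by one
    # Average the action rates with the scheduler's probabilities.
    return sum(p * R.get((s, a, s2), 0.0) for a, p in sigma(s, n).items())
```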

We extend the notation of transient probabilities to scheduled 
nondeterministic models. It is known that for DTMDPs CR 
schedulers are as powerful as HR schedulers. 

Definition 25: For a DTMDP $D = (S, Act, P)$ and a scheduler
$\sigma \in \Sigma_{HR} \cup \Sigma_{CD} \cup \Sigma_{CR}$ we define $\pi^{D, \sigma}(s_0, k, s_k) = \Pr(X_k^{D, \sigma, s_0} = s_k)$
for all $k \in \mathbb{N}$ and $s_0, s_k \in S$.



Lemma 1: Consider a DTMDP $D = (S, Act, P)$ and a HR
$\sigma_{hr}$. Then there is a CR $\sigma_{cr}$ such that for all $s_0, s_n \in S$ and
$n \in \mathbb{N}$ we have $\pi^{D, \sigma_{hr}}(s_0, n, s_n) = \pi^{D, \sigma_{cr}}(s_0, n, s_n)$.

Proof: The proof is given in ITTOI and |l40l Theorem 5.5.1] 
where CR are denoted as MR (Markov randomized) policies. 

□ 

Definition 26: For a CTMC $C = (S, R)$, we let $\mathrm{emb}(C) = (S, P)$
denote the DTMC such that for all $s, s' \in S$ we have

The following lemma states how values of CTMCs can be 
computed using the embedded discrete-time model. 

Lemma 2: Let $C = (S, R)$ be a CTMC with a reward
structure $r = (r_c, r_f)$. Let $u = u(C)$. Then for $t \geq 0$ and
all $s_0 \in S$ it is

\[
V(C, s_0, r, t) = V(\mathrm{emb}(C), s_0, r, t, u).
\]

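Proof: By definition, for $s_0 \in S$ it is

\[
V(C, s_0, r, t) = \underbrace{E\left[\int_0^t r_c(X_v^{C, s_0})\, \mathrm{d}v\right]}_{\text{accumulated}} + \underbrace{E\left[r_f(X_t^{C, s_0})\right]}_{\text{final}}.
\]

We have

\[
\begin{aligned}
\text{final} &= E\left[r_f(X_t^{C, s_0})\right] \\
&= \sum_{s \in S} \Pr(X_t^{C, s_0} = s)\, r_f(s) \\
&= \sum_{s \in S} \left( \sum_{i=0}^{\infty} \phi_{ut}(i)\, \pi^{\mathrm{emb}(C)}(s_0, i, s) \right) r_f(s) \\
&= \sum_{i=0}^{\infty} \phi_{ut}(i) \sum_{s \in S} \pi^{\mathrm{emb}(C)}(s_0, i, s)\, r_f(s).
\end{aligned}
\]

Further, Kwiatkowska et al. [Theorem 1] have shown that

\[
\text{accumulated} = \sum_{i=0}^{\infty} \psi_{ut}(i) \sum_{s \in S} \pi^{\mathrm{emb}(C)}(s_0, i, s)\, \frac{r_c(s)}{u}.
\]

Thus,

\[
\begin{aligned}
V(C, s_0, r, t) &= \sum_{i=0}^{\infty} \left( \phi_{ut}(i) \sum_{s \in S} \pi^{\mathrm{emb}(C)}(s_0, i, s)\, r_f(s) + \psi_{ut}(i) \sum_{s \in S} \pi^{\mathrm{emb}(C)}(s_0, i, s)\, \frac{r_c(s)}{u} \right) \\
&= V(\mathrm{emb}(C), s_0, r, t, u).
\end{aligned}
\]

□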
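
Lemma 2 can be tried out numerically with a small sketch. It rests on several assumptions that are not spelled out in this part of the text: that emb(C) is the uniformised chain with u(C) the maximal exit rate, that φ_ut(i) is the Poisson probability with parameter u·t, and that ψ_ut(i) = 1 − Σ_{j≤i} φ_ut(j). The infinite sum is cut off at a caller-supplied k, and the names `poisson_weights` and `ctmc_value` are hypothetical.

```python
import math


def poisson_weights(lam, k):
    """Return (phi, psi) for i = 0..k, assuming phi(i) = e^-lam * lam^i / i!
    and psi(i) = 1 - sum_{j <= i} phi(j)."""
    phi = [math.exp(-lam)]
    for i in range(1, k + 1):
        phi.append(phi[-1] * lam / i)
    psi, cum = [], 0.0
    for p in phi:
        cum += p
        psi.append(max(0.0, 1.0 - cum))
    return phi, psi


def ctmc_value(R, r_c, r_f, s0, t, k):
    """Approximate V(C, s0, r, t) by the first k+1 terms of V(emb(C), ...).

    R[s][s2] is the rate from s to s2 (diagonal ignored). Assumes at least
    one positive rate, so that the uniformisation rate u is positive.
    """
    n = len(R)
    exit_rates = [sum(R[s][s2] for s2 in range(n) if s2 != s) for s in range(n)]
    u = max(exit_rates)
    # Uniformised DTMC (assumed form of emb(C)).
    P = [[R[s][s2] / u if s2 != s else 1.0 - exit_rates[s] / u
          for s2 in range(n)] for s in range(n)]
    phi, psi = poisson_weights(u * t, k)
    dist = [0.0] * n
    dist[s0] = 1.0
    value = 0.0
    for i in range(k + 1):
        final = sum(dist[s] * r_f[s] for s in range(n))
        accumulated = sum(dist[s] * r_c[s] for s in range(n)) / u
        value += phi[i] * final + psi[i] * accumulated
        dist = [sum(dist[s] * P[s][s2] for s in range(n)) for s2 in range(n)]
    return value


# Hypothetical example: state 0 moves to absorbing state 1 with rate 2.
R_example = [[0.0, 2.0],
             [0.0, 0.0]]
approx = ctmc_value(R_example, r_c=[1.0, 0.0], r_f=[0.0, 1.0], s0=0, t=1.0, k=50)
```

For this example the value has the closed form 1.5·(1 − e^{-2}): the expected time spent in state 0 before time 1 plus the probability of having reached state 1 by then.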

We can now show that the restricted class CR suffices to 
obtain optimal values. 

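Lemma 3: Given a CTMDP $C = (S, Act, R)$ and $\sigma_{hr} \in \Sigma_{HR}$
there is $\sigma_{cr} \in \Sigma_{CR}$ such that $V(X^{C, \sigma_{hr}, s_0}, r, t) = V(X^{C, \sigma_{cr}, s_0}, r, t)$.
Further, for all $\sigma_{cr} \in \Sigma_{CR}$ we can find $\sigma_{hr} \in \Sigma_{HR}$ such that
$V(X^{C, \sigma_{cr}, s_0}, r, t) = V(X^{C, \sigma_{hr}, s_0}, r, t)$.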

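Proof: Consider a CTMDP $C = (S, Act, R)$ and $\sigma_{hr} \in \Sigma_{HR}$.
By Lemma 1 we can find a scheduler $\sigma_{cr} \in \Sigma_{CR}$ such that

\[
\pi^{\mathrm{emb}(C), \sigma_{hr}} = \pi^{\mathrm{emb}(C), \sigma_{cr}}. \tag{1}
\]

Define

• $r^{hr} = (r_c^{hr}, r_f^{hr})$ with

• $r_c^{hr}(\beta, s) = r_c(s)$ and
• $r_f^{hr}(\beta, s) = r_f(s)$

and let

• $r^{cr} = (r_c^{cr}, r_f^{cr})$ with

• $r_c^{cr}(s, n) = r_c(s)$ and
• $r_f^{cr}(s, n) = r_f(s)$

for $\beta \in (S \times Act)^*$, $n \in \mathbb{N}$ and $s \in S$. Then for all $s_0 \in S$ we
have

\[
\begin{aligned}
E\left[r_f(X_t^{C, \sigma_{hr}, s_0})\right]
&= E\left[r_f(f(X_t^{C_{\sigma_{hr}}, s_0}))\right] \\
&= \sum_{s \in S} \Pr\left(f(X_t^{C_{\sigma_{hr}}, s_0}) = s\right) r_f(s) \\
&= \sum_{s \in S} \Pr\left(\exists \beta \in (S \times Act)^*.\ X_t^{C_{\sigma_{hr}}, s_0} = (\beta, s)\right) r_f(s) \\
&= \sum_{(\beta, s) \in (S \times Act)^* \times S} \Pr\left(X_t^{C_{\sigma_{hr}}, s_0} = (\beta, s)\right) r_f^{hr}(\beta, s) \\
&= E\left[r_f^{hr}(X_t^{C_{\sigma_{hr}}, s_0})\right]
\end{aligned}
\]

and similarly

\[
\int_0^t E\left[r_c(X_v^{C, \sigma_{hr}, s_0})\right] \mathrm{d}v = \int_0^t E\left[r_c^{hr}(X_v^{C_{\sigma_{hr}}, s_0})\right] \mathrm{d}v
\]

and in turn

\[
V(X^{C, \sigma_{hr}, s_0}, r, t) = V(X^{C_{\sigma_{hr}}, s_0}, r^{hr}, t). \tag{2}
\]

In the same way, using $n \in \mathbb{N}$ instead of $\beta \in (S \times Act)^*$, one
can show

\[
V(X^{C, \sigma_{cr}, s_0}, r, t) = V(X^{C_{\sigma_{cr}}, s_0}, r^{cr}, t). \tag{3}
\]

Notice that for $\sigma \in \Sigma_{CR} \cup \Sigma_{HR}$ it is

\[
\mathrm{emb}(C_\sigma) = \mathrm{emb}(C)_\sigma. \tag{4}
\]

In addition,

\[
\begin{aligned}
& V(\mathrm{emb}(C)_{\sigma_{hr}}, s_0, r^{hr}, t, u) \\
&= \sum_{i=0}^{\infty} \left( \phi_{ut}(i) \sum_{s \in (S \times Act)^* \times S} \pi^{\mathrm{emb}(C)_{\sigma_{hr}}}(s_0, i, s) \cdot r_f^{hr}(s) + \psi_{ut}(i) \sum_{s \in (S \times Act)^* \times S} \pi^{\mathrm{emb}(C)_{\sigma_{hr}}}(s_0, i, s) \cdot \frac{r_c^{hr}(s)}{u} \right) \\
&= \sum_{i=0}^{\infty} \left( \phi_{ut}(i) \sum_{s \in S} \pi^{\mathrm{emb}(C), \sigma_{hr}}(s_0, i, s)\, r_f(s) + \psi_{ut}(i) \sum_{s \in S} \pi^{\mathrm{emb}(C), \sigma_{hr}}(s_0, i, s)\, \frac{r_c(s)}{u} \right) \\
&\overset{(1)}{=} \sum_{i=0}^{\infty} \left( \phi_{ut}(i) \sum_{s \in S} \pi^{\mathrm{emb}(C), \sigma_{cr}}(s_0, i, s)\, r_f(s) + \psi_{ut}(i) \sum_{s \in S} \pi^{\mathrm{emb}(C), \sigma_{cr}}(s_0, i, s)\, \frac{r_c(s)}{u} \right) \\
&= V(\mathrm{emb}(C)_{\sigma_{cr}}, s_0, r^{cr}, t, u).
\end{aligned} \tag{5}
\]

From these facts, we have

\[
\begin{aligned}
V(X^{C, \sigma_{hr}, s_0}, r, t)
&\overset{\text{Eqn. 2}}{=} V(X^{C_{\sigma_{hr}}, s_0}, r^{hr}, t) \\
&\overset{\text{Lem. 2}}{=} V(\mathrm{emb}(C_{\sigma_{hr}}), s_0, r^{hr}, t, u) \\
&\overset{\text{Eqn. 4}}{=} V(\mathrm{emb}(C)_{\sigma_{hr}}, s_0, r^{hr}, t, u) \\
&\overset{\text{Eqn. 5}}{=} V(\mathrm{emb}(C)_{\sigma_{cr}}, s_0, r^{cr}, t, u) \\
&\overset{\text{Eqn. 4}}{=} V(\mathrm{emb}(C_{\sigma_{cr}}), s_0, r^{cr}, t, u) \\
&\overset{\text{Lem. 2}}{=} V(X^{C_{\sigma_{cr}}, s_0}, r^{cr}, t) \\
&\overset{\text{Eqn. 3}}{=} V(X^{C, \sigma_{cr}, s_0}, r, t).
\end{aligned} \tag{6}
\]

If we start with a CR $\sigma_{cr}$, we can define the HR $\sigma_{hr}$ such
that for $\beta \in (S \times Act)^n$ with $n \in \mathbb{N}$ we have $\sigma_{hr}(\beta, s) = \sigma_{cr}(s, n)$
and then the result can be shown in the same way. □

From Lemma 3 we can conclude that the maximum is
obtained by a scheduler in $\Sigma_{CR}$.

Corollary 2: Given a CTMDP $C$ and reward structure $r$, for
all $s_0 \in S$ it is

\[
V^{\max}(C, s_0, r, t) = \max_{\sigma \in \Sigma_{CR}} V(X^{C, \sigma, s_0}, r, t).
\]

Because of Corollary 2 we only have to show that Algorithm 1
computes the maximum over all $\sigma \in \Sigma_{CR}$ up to the
specified precision.

We first show that to compute the value of a given CTMDP
for a given scheduler up to a required precision, it suffices to
consider a limited number of steps in the embedded model.

Lemma 4: Given a CTMDP $C = (S, Act, R)$ with a reward
structure $r = (r_c, r_f)$, a precision $\varepsilon > 0$ and $k$ such that

\[
\sum_{n=0}^{k} \psi_{ut}(n) > ut - \frac{\varepsilon u}{2 r_c^{\max}} \quad \text{and} \quad \psi_{ut}(k) \cdot r_f^{\max} < \frac{\varepsilon}{2},
\]

then for all schedulers $\sigma \in \Sigma_{CR}$ and all $s_0 \in S$ we have

\[
\sum_{i=k+1}^{\infty} \left( \phi_{ut}(i) \sum_{s \in S} \pi^{\mathrm{emb}(C), \sigma}(s_0, i, s)\, r_f(s) + \psi_{ut}(i) \sum_{s \in S} \pi^{\mathrm{emb}(C), \sigma}(s_0, i, s)\, \frac{r_c(s)}{u} \right) < \varepsilon.
\]

Proof: Consider a CTMC $C' = (S', R')$ with reward
structure $r' = (r'_c, r'_f)$ and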

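\[
V(C', s_0, r', t) = \underbrace{E\left[\int_0^t r'_c(X_v^{C', s_0})\, \mathrm{d}v\right]}_{\text{accumulated}} + \underbrace{E\left[r'_f(X_t^{C', s_0})\right]}_{\text{final}}
\]

for any $s_0 \in S'$. It is known [53, remark below Theorem 2]
that if

\[
\sum_{n=0}^{k} \psi_{ut}(n) > ut - \frac{\varepsilon u}{2 r_c'^{\max}}
\]

then

\[
\sum_{i=k+1}^{\infty} \psi_{ut}(i) \sum_{s \in S'} \pi^{\mathrm{emb}(C')}(s_0, i, s)\, \frac{r'_c(s)}{u} < \frac{\varepsilon}{2},
\]

and if

\[
\psi_{ut}(k) \cdot r_f'^{\max} < \frac{\varepsilon}{2}
\]

then

\[
\sum_{i=k+1}^{\infty} \phi_{ut}(i) \sum_{s \in S'} \pi^{\mathrm{emb}(C')}(s_0, i, s)\, r'_f(s)
\leq \sum_{i=k+1}^{\infty} \phi_{ut}(i)\, r_f'^{\max}
= \psi_{ut}(k) \cdot r_f'^{\max} < \frac{\varepsilon}{2}.
\]

Thus,

\[
\sum_{i=k+1}^{\infty} \left( \phi_{ut}(i) \sum_{s \in S'} \pi^{\mathrm{emb}(C')}(s_0, i, s)\, r'_f(s) + \psi_{ut}(i) \sum_{s \in S'} \pi^{\mathrm{emb}(C')}(s_0, i, s)\, \frac{r'_c(s)}{u} \right) < \varepsilon. \tag{7}
\]

Now with $r' = (r'_c, r'_f)$ where $r'_c(s, n) = r_c(s)$ and $r'_f(s, n) = r_f(s)$,
because $r_c'^{\max} = r_c^{\max}$ and $r_f'^{\max} = r_f^{\max}$, applying Equation 7 to $C' = C_\sigma$ gives

\[
\begin{aligned}
& \sum_{i=k+1}^{\infty} \left( \phi_{ut}(i) \sum_{s \in S} \pi^{\mathrm{emb}(C), \sigma}(s_0, i, s)\, r_f(s) + \psi_{ut}(i) \sum_{s \in S} \pi^{\mathrm{emb}(C), \sigma}(s_0, i, s)\, \frac{r_c(s)}{u} \right) \\
&= \sum_{i=k+1}^{\infty} \left( \phi_{ut}(i) \sum_{s \in S \times \mathbb{N}} \pi^{\mathrm{emb}(C_\sigma)}(s_0, i, s)\, r'_f(s) + \psi_{ut}(i) \sum_{s \in S \times \mathbb{N}} \pi^{\mathrm{emb}(C_\sigma)}(s_0, i, s)\, \frac{r'_c(s)}{u} \right) \\
&< \varepsilon.
\end{aligned}
\]

□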
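
The choice of k in Lemma 4 can be computed by a simple search. The following sketch assumes Poisson weights φ_ut(i) and ψ_ut(i) = 1 − Σ_{j≤i} φ_ut(j), a moderate value of u·t (so that e^{-ut} does not underflow), and thresholds εu/(2 r_c^max) and ε/2; the function name `truncation_point` and the guard parameter `max_k` are hypothetical.

```python
import math


def truncation_point(u, t, eps, rc_max, rf_max, max_k=100000):
    """Return the smallest k with
         sum_{n<=k} psi(n) > u*t - eps*u/(2*rc_max)  and  psi(k)*rf_max < eps/2,
    where psi(n) = 1 - sum_{j<=n} phi(j) and phi is Poisson with mean u*t.
    A zero rc_max makes the first condition vacuous (assumption of this sketch).
    """
    lam = u * t
    phi = math.exp(-lam)        # phi(0)
    cum_phi = phi               # sum_{j<=k} phi(j)
    psi = 1.0 - cum_phi         # psi(k)
    cum_psi = psi               # sum_{n<=k} psi(n)
    k = 0
    while True:
        acc_ok = rc_max == 0 or cum_psi > lam - eps * u / (2.0 * rc_max)
        fin_ok = psi * rf_max < eps / 2.0
        if acc_ok and fin_ok:
            return k
        if k >= max_k:
            raise ValueError("no truncation point found below max_k")
        k += 1
        phi *= lam / k
        cum_phi += phi
        psi = max(0.0, 1.0 - cum_phi)
        cum_psi += psi


k_example = truncation_point(u=2.0, t=1.0, eps=1e-6, rc_max=1.0, rf_max=1.0)
```

The first condition uses the fact that the ψ values sum to u·t over all n, so the gap between the partial sum and u·t is exactly the neglected tail.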



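Algorithm 4: Compute value of $C = (S, Act, R)$, reward
structure $(r_c, r_f)$ and CR $\sigma$ up to precision $\varepsilon$.

1 let $k$ s.t. $\sum_{n=0}^{k} \psi_{ut}(n) > ut - \frac{\varepsilon u}{2 r_c^{\max}} \wedge \psi_{ut}(k) \cdot r_f^{\max} < \frac{\varepsilon}{2}$

2 $(S, Act, P) := \mathrm{emb}(C)$

3 forall $s \in S$ do $q_{k+1}(s) := 0$

4 forall $i = k, k - 1, \ldots, 0$ do

5     forall $s \in S$ do

6         $m := \sum_{\alpha \in Act(s)} \sigma(s, i)(\alpha) \sum_{s' \in S} P(s, \alpha, s')\, q_{i+1}(s')$

7         $q_i(s) := m + \phi_{ut}(i) \cdot r_f(s) + \psi_{ut}(i) \cdot \frac{r_c(s)}{u}$

8 return $q_0$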
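
The backward iteration of Algorithm 4 can be sketched as follows. The sketch assumes that the embedded DTMDP is already given (as a list mapping each state to a dictionary from action names to probability rows), that k has been chosen beforehand, and that φ and ψ are the Poisson-based weights; the data layout and the name `evaluate_scheduler` are hypothetical.

```python
import math


def evaluate_scheduler(P, sigma, r_c, r_f, u, t, k):
    """Backward iteration for a fixed counting scheduler.

    P[s][a] is the probability row of action a in state s of the embedded
    model; sigma(s, i) returns a dict {action: probability} for step i.
    Returns the vector q_0.
    """
    n = len(P)
    lam = u * t
    phi = [math.exp(-lam)]                       # assumed Poisson weights
    for i in range(1, k + 1):
        phi.append(phi[-1] * lam / i)
    psi, cum = [], 0.0
    for p in phi:
        cum += p
        psi.append(max(0.0, 1.0 - cum))          # assumed psi(i) = 1 - sum phi(0..i)

    q = [0.0] * n                                # line 3: q_{k+1} := 0
    for i in range(k, -1, -1):                   # line 4
        q_next, q = q, []
        for s in range(n):                       # line 5
            m = sum(prob * sum(P[s][a][s2] * q_next[s2] for s2 in range(n))
                    for a, prob in sigma(s, i).items())          # line 6
            q.append(m + phi[i] * r_f[s] + psi[i] * r_c[s] / u)  # line 7
    return q                                     # line 8


# Hypothetical model: one action per state, state 1 absorbing, u = 2.
P_example = [{"a": [0.0, 1.0]}, {"a": [0.0, 1.0]}]
q0 = evaluate_scheduler(P_example, lambda s, i: {"a": 1.0},
                        r_c=[1.0, 0.0], r_f=[0.0, 1.0], u=2.0, t=1.0, k=50)
```

With a single action per state the model is a CTMC, so `q0[0]` should agree with the closed-form value 1.5·(1 − e^{-2}) of the corresponding two-state chain.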



We can show that Algorithm 4 computes the values of a 
CTMDP given a certain scheduler up to a required precision. 

Lemma 5: Consider a CTMDP $C = (S, Act, R)$, a time
bound $t$, a scheduler $\sigma \in \Sigma_{CR}$, and let $q$ be the return value of
Algorithm 4. Then for all $s_0 \in S$ it is $|q(s_0) - V(X^{C, \sigma, s_0}, r, t)| < \varepsilon$.



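Proof: Consider the values $q = q_0$ returned by Algorithm 4.
It is

\[
\begin{aligned}
q(s_0)
&= \phi_{ut}(0) \cdot r_f(s_0) + \psi_{ut}(0) \frac{r_c(s_0)}{u} \\
&\quad + \sum_{\alpha_0 \in Act} \sigma(s_0, 0)(\alpha_0) \sum_{s_1 \in S} P(s_0, \alpha_0, s_1) \cdot \Bigl( \phi_{ut}(1) \cdot r_f(s_1) + \psi_{ut}(1) \frac{r_c(s_1)}{u} \\
&\qquad + \sum_{\alpha_1 \in Act} \sigma(s_1, 1)(\alpha_1) \sum_{s_2 \in S} P(s_1, \alpha_1, s_2) \cdots \Bigr) \\
&= \phi_{ut}(0) \cdot r_f(s_0) + \psi_{ut}(0) \frac{r_c(s_0)}{u} \\
&\quad + \sum_{\alpha_0 \in Act} \sigma(s_0, 0)(\alpha_0) \sum_{s_1 \in S} P(s_0, \alpha_0, s_1) \Bigl( \phi_{ut}(1) \cdot r_f(s_1) + \psi_{ut}(1) \frac{r_c(s_1)}{u} \Bigr) \\
&\quad + \sum_{\alpha_0 \in Act} \sigma(s_0, 0)(\alpha_0) \sum_{s_1 \in S} P(s_0, \alpha_0, s_1) \sum_{\alpha_1 \in Act} \sigma(s_1, 1)(\alpha_1) \cdots \\
&= \sum_{i=0}^{k} \left( \phi_{ut}(i) \sum_{s \in S} \pi^{\mathrm{emb}(C), \sigma}(s_0, i, s)\, r_f(s) + \psi_{ut}(i) \sum_{s \in S} \pi^{\mathrm{emb}(C), \sigma}(s_0, i, s)\, \frac{r_c(s)}{u} \right) \\
&= V(\mathrm{emb}(C)_\sigma, s_0, r, t, u) - \sum_{i=k+1}^{\infty} \left( \phi_{ut}(i) \sum_{s \in S} \pi^{\mathrm{emb}(C), \sigma}(s_0, i, s)\, r_f(s) + \psi_{ut}(i) \sum_{s \in S} \pi^{\mathrm{emb}(C), \sigma}(s_0, i, s)\, \frac{r_c(s)}{u} \right) \\
&\overset{\text{Lem. 4}}{=} V(\mathrm{emb}(C)_\sigma, s_0, r, t, u) - \varepsilon'
\end{aligned} \tag{8}
\]

for some $\varepsilon'$ with $0 \leq \varepsilon' < \varepsilon$, and thus $|q(s_0) - V(X^{C, \sigma, s_0}, r, t)| < \varepsilon$.

□

The value obtained from Algorithm 1 will be no smaller than
the one obtained from applying Algorithm 4 on an arbitrary
scheduler.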

Lemma 6: Consider a CTMDP $C = (S, Act, R)$, a time
bound $t$, an arbitrary scheduler $\sigma \in \Sigma_{CR}$, let $q$ be the return value
of Algorithm 4 and let $q'$ be the return value of Algorithm 1.
Then for all $s_0 \in S$ it is $q(s_0) \leq q'(s_0)$.



Proof: Let $q_i$ be as given in Algorithm 4 and let $q'_i$ be the
corresponding vector of Algorithm 1. We show the lemma by
backward induction on the program variable $i$.

Induction start: $i = k + 1$: Before the main loop at the lines
3, both algorithms assign $q_{k+1}(s) = q'_{k+1}(s) = 0$ for all $s \in S$.

Induction assumption: Assume it is $q_{i+1}(s) \leq q'_{i+1}(s)$ at the
beginning of the main loops, that is before lines 4.

Induction step: Consider $m$ of Algorithm 4 and corresponding
$m'$ of Algorithm 1 after the assignment to this variable at line
6. It is

\[
\begin{aligned}
m &= \sum_{\alpha \in Act(s)} \sigma(s, i)(\alpha) \sum_{s' \in S} P(s, \alpha, s')\, q_{i+1}(s') \\
&\leq \max_{\alpha \in Act(s)} \sum_{s' \in S} P(s, \alpha, s')\, q_{i+1}(s') \\
&\leq \max_{\alpha \in Act(s)} \sum_{s' \in S} P(s, \alpha, s')\, q'_{i+1}(s') \\
&= m'.
\end{aligned}
\]

Because lines 7 are identical in both algorithms, also $q_i \leq q'_i$
at the end of the main loops. □
With these preparations, we can now prove the first part
of Proposition 1.

Lemma 7: Consider a CTMDP $C = (S, Act, R)$, a reward
structure $r$, a time bound $t$ and let $q'$ be the return value of
Algorithm 1. Then for all $s_0 \in S$ it is $|q'(s_0) - V^{\max}(C, s_0, r, t)| < \varepsilon$.



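Proof: Let $\sigma \in \Sigma_{CR}$ be such that

\[
V(X^{C, \sigma, s_0}, r, t) = V^{\max}(C, s_0, r, t). \tag{9}
\]

Then because of Lemma 5 it is $|q(s_0) - V^{\max}(C, s_0, r, t)| < \varepsilon$, where
$q$ is the return value of Algorithm 4 for $\sigma$. By Lemma 6 we
know that $q(s_0) \leq q'(s_0)$. By adding the assignment

\[
\sigma'_{cd}(s, i) := \arg\max_{\alpha \in Act(s)} \sum_{s' \in S} P(s, \alpha, s')\, q_{i+1}(s')
\]

into the inner loop of Algorithm 1 after Line 7 we can obtain
a prefix of the scheduler $\sigma'_{cd} \in \Sigma_{CD}$. Consider $\sigma'_{cr} \in \Sigma_{CR}$
such that $\sigma'_{cr}(s, i)(\alpha) = 1$ if $\sigma'_{cd}(s, i) = \alpha$ and $\sigma'_{cr}(s, i)(\alpha) = 0$
else. It can easily be shown that applying Algorithm 4 on $\sigma'_{cr}$
also yields the value $q'$. Thus again by Lemma 5 we have
$|q'(s_0) - V(X^{C, \sigma'_{cr}, s_0}, r, t)| < \varepsilon$. Since $V(X^{C, \sigma'_{cr}, s_0}, r, t) \leq V^{\max}(C, s_0, r, t)$,
the two bounds together give $V^{\max}(C, s_0, r, t) - \varepsilon < q'(s_0) < V^{\max}(C, s_0, r, t) + \varepsilon$.

□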
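
The scheduler extraction used in the proof of Lemma 7 can be sketched in code. Algorithm 1 itself is not reproduced in this appendix, so the sketch assumes, following the proof of Lemma 6, that it differs from Algorithm 4 only in taking a maximum over actions at line 6; the data layout and the name `maximise` are hypothetical.

```python
import math


def maximise(P, r_c, r_f, u, t, k):
    """Backward iteration with a maximum over actions (assumed shape of
    Algorithm 1), recording the arg max as a counting deterministic decision.

    P[s][a] is the probability row of action a in state s of the embedded
    model. Returns (q_0, decisions) with decisions[i][s] the chosen action.
    """
    n = len(P)
    lam = u * t
    phi = [math.exp(-lam)]                       # assumed Poisson weights
    for i in range(1, k + 1):
        phi.append(phi[-1] * lam / i)
    psi, cum = [], 0.0
    for p in phi:
        cum += p
        psi.append(max(0.0, 1.0 - cum))

    q = [0.0] * n
    decisions = [None] * (k + 1)
    for i in range(k, -1, -1):
        q_next, q, choice = q, [], []
        for s in range(n):
            def expected(a):
                return sum(P[s][a][s2] * q_next[s2] for s2 in range(n))
            best = max(P[s], key=expected)       # arg max over enabled actions
            choice.append(best)
            q.append(expected(best) + phi[i] * r_f[s] + psi[i] * r_c[s] / u)
        decisions[i] = choice
    return q, decisions


# Hypothetical model: in state 0 one may stay or go to the rewarding state 1.
P_example = [{"stay": [1.0, 0.0], "go": [0.0, 1.0]},
             {"stay": [0.0, 1.0]}]
q_max, decisions = maximise(P_example, r_c=[0.0, 0.0], r_f=[0.0, 1.0],
                            u=2.0, t=1.0, k=50)
```

In this example moving immediately is optimal, so the recorded decision for state 0 at step 0 is "go" and the value is the probability that at least one uniformisation step occurs before time t.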

We can now show that deterministic schedulers suffice, by
using the fact that Algorithm 1 indeed maximises only over
this class.

Lemma 8: Let $C = (S, Act, R)$ be a CTMDP with reward
structure $r = (r_c, r_f)$. Then there exists $\sigma \in \Sigma_{CD}$ such that
$V^{\max}(C, s_0, r, t) = V(X^{C, \sigma, s_0}, r, t)$.

Proof: Assume the lemma does not hold. Then for each
$\sigma \in \Sigma_{CD}$ there is an $\varepsilon > 0$ such that for some state $s_0 \in S$
we have $|V(X^{C, \sigma, s_0}, r, t) - V^{\max}(C, s_0, r, t)| > \varepsilon$. Now consider
$\sigma'_{cd} \in \Sigma_{CD}$ obtained by Algorithm 1 as in the proof of Lemma 7
using the same required precision $\varepsilon$. By the correctness of
Algorithm 1 we then have $|V(X^{C, \sigma'_{cd}, s_0}, r, t) - V^{\max}(C, s_0, r, t)| < \varepsilon$.
This contradicts the assumption, because with the extension
in the proof of Lemma 7 the algorithm also computes a CD
which obtains this precision. □
The idea of using the fact that the optimising algorithm
computes a scheduler of a more restricted class than the
class considered was adapted from various proofs for similar
problems in the discrete-time setting [40].

Appendix B 



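Proof of Proposition 2

Consider $(S, P) = D = \mathrm{emb}(C)$ and $(\mathfrak{A}, P') = D' = \mathrm{emb}(C')$.
We define the HR $\sigma\colon ((\mathfrak{A} \times Act)^* \times \mathfrak{A}) \to \mathrm{Distr}(Act)$ so that for
$\beta = \mathfrak{z}_0 \alpha_0 \mathfrak{z}_1 \alpha_1 \ldots \alpha_{n-1} \mathfrak{z}_n$ we have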

crm^s) " Pr(Xf - s I Zf A "/\ ••'» = 

1=0 

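By induction, for all $n \in \mathbb{N}$ and $\mathfrak{z} \in \mathfrak{A}$ we have
$\Pr(X_n^{D} \in \mathfrak{z}) = \Pr(X_n^{D', \sigma} = \mathfrak{z})$.

Thus, using Definition 22, Definition 24 and the definition of
the reward structures $r$, $r'$ it is

\[
V(D, s_0, r, t, u) \leq V(D'_\sigma, \mathfrak{z}_0, r', t, u),
\]

and thus using Lemma 2 it is

\[
V(X^{C, s_0}, r, t) \leq V(X^{C', \sigma, \mathfrak{z}_0}, r', t) \leq V^{\max}(C', \mathfrak{z}_0, r', t).
\]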

Appendix C 

Proof of Proposition 3

We only show that by using a finer abstraction the maximal
bound cannot increase. The case for the minimal bound is
likewise. We will show that for each $\varepsilon > 0$ it is

\[
V^{\max}(C, \mathfrak{z}_i, r, t) + \varepsilon \geq V^{\max}(C', \mathfrak{z}'_{i,j}, r', t). \tag{10}
\]

This then also shows that the inequation holds for $\varepsilon = 0$:
If $V^{\max}(C, \mathfrak{z}_i, r, t) < V^{\max}(C', \mathfrak{z}'_{i,j}, r', t)$ then $\varepsilon' =
V^{\max}(C', \mathfrak{z}'_{i,j}, r', t) - V^{\max}(C, \mathfrak{z}_i, r, t)$ is positive. By subtracting
$V^{\max}(C, \mathfrak{z}_i, r, t)$ from Equation 10 we have

\[
\varepsilon \geq \varepsilon'.
\]

As this equation must hold for all $\varepsilon$, for instance $\varepsilon'/2$, this is a
contradiction.

We show Equation 10. To do so, we show by a backward
induction on Algorithm 1 that for the value $q_0$ obtained for
$C$ and the value $q'_0$ obtained for $C'$ we have $q_0 \geq q'_0$ using a
precision of $\varepsilon$. By the precision guarantee of the algorithm,
this in turn shows the equation.

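Induction start: $i = k + 1$: Before the main loop, at Line 3,
both runs assign $q_{k+1}(\mathfrak{z}_i) = q'_{k+1}(\mathfrak{z}'_{i,j}) = 0$ for $\mathfrak{z}_i \in \mathfrak{A}$, $\mathfrak{z}'_{i,j} \in \mathfrak{A}'$.

Induction assumption: Assume it is $q_{i+1}(\mathfrak{z}_i) \geq q'_{i+1}(\mathfrak{z}'_{i,j})$ for
all $\mathfrak{z}_i \in \mathfrak{A}$, $\mathfrak{z}'_{i,j} \in \mathfrak{A}'$ with $\mathfrak{z}_i = \bigcup_j \mathfrak{z}'_{i,j}$ at the beginning of the
main loops, that is before Line 4.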



Induction step: Consider $m$ for $\mathfrak{z}_i$ of $C$ and corresponding $m'$
for $\mathfrak{z}'_{i,j}$ of $C'$ after the assignment to this variable at Line 6. Let
$(a', \alpha')$ be a maximising decision for some $\mathfrak{z}'_{i,j}$ at this line. By
the definition of the abstraction, for $\mathfrak{z}_i$ we find a corresponding
$(a, \alpha)$ such that

• Dom(a') — Dom{a), 

. for each c e Dom(a') we have a'{c) - and a{c) - 

(Sv, I) with ^'^j c 3„. 
Then it is 

m' = max yP(^,j,(a',a'),^[,,)qi+i{^[,) 

{a',o')£Acl{i' ) f—f 

= 2 P(s;,,.,(a',a'),3;,,)*+i(3:,,) 

< ^ P(3f, (a, a), 3v.)?i+i(S..) 

< max y P(3„ (a, a), (s.) 

(u',a)£Aef(5,) 
= OT. 



The rewards for the refined abstraction cannot be larger than
the ones for the coarser one. Thus, after Line 7 we still have
$q_i \geq q'_i$ at the end of each iteration. ■






